My HTML5 / CSS3 / Browser Wish List
In no particular order, here's what I'd like to see in the HTML5/CSS3 specifications, and/or implemented in new browser versions.
If the JavaScript thread is initiated from a click event (mouseDown, mouseUp, etc.), it can call HTMLElement.fullScreen() to "zoom" the element to fit the entire screen, covering all windows, the menu bar and any other OS widgets. If multiple screens are attached to the machine, only the screen containing the mouse cursor is fullscreened. Example:
<div id="game" style="width:640px; height:480px;"></div>
<a href="#" onClick="document.getElementById('game').fullScreen();">Fullscreen</a>
It is up to the developer to resize (or scale) the DOM element as needed (or else it is just centered in the screen with a black background), by grabbing the screen.width and screen.height globals. The developer could apply a CSS scale transform, or just change the DOM element's width and height properties as needed.
Pressing ESC (or any key combination that would normally close the window/tab, or switch out of the app) cancels the fullscreen and returns to the normal window view (the browser may also implement a "close" or "cancel" floating visible button in a corner of the screen if desired). At this time the DOM element is sent an event, "cancelFullScreen" which can be listened for and acted upon (i.e. resize/scale the element back to normal size).
During fullscreen mode, the rest of the page is still accessible via the DOM, but all elements are "offscreen" and not visible (except of course for child elements inside the target element). No scrollbar is displayed, even if the page is larger than the screen. If a new window or tab is opened while in fullscreen mode, normal windowed mode is restored. Technically the developer could call fullScreen() on the BODY element, and then manipulate the scrollTop property to achieve fullscreen scrolling, if desired.
To programmatically cancel fullscreen mode, the developer can just call window.close().
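Putting the pieces together, here is a rough sketch of how a page might use the proposed API. Everything specific to the proposal (fullScreen() and the "cancelFullScreen" event) is hypothetical; the CSS scale transform and the screen globals already exist, and the vendor prefix would vary by browser.
function goFullScreen() {
   var game = document.getElementById('game');
   
   // scale the 640x480 element up to fill the screen
   var scale = Math.min( screen.width / 640, screen.height / 480 );
   game.style.webkitTransform = 'scale(' + scale + ')';
   
   // scale it back down when the user exits fullscreen mode
   game.addEventListener( 'cancelFullScreen', function() {
      game.style.webkitTransform = 'scale(1)';
   }, false );
   
   game.fullScreen(); // hypothetical, as described above
}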
Update: I realize the HTML5 specification is against allowing video elements to go fullscreen, with the stated reason being one of security. I strongly disagree with this, and think it should be allowed on any DOM element, as long as the code thread that initiates the fullscreen mode originates from a mouse click. If there is a huge outcry over this, the browser could simply prompt the user the first time for each unique page URL, asking if it is okay if the page goes fullscreen. The user may click "Yes" and optionally check a "Remember This" checkbox, so he/she is never bothered by that page again.
Update: Looks like Firefox will support this sometime after v4 is released: https://wiki.mozilla.org/Gecko:FullScreenAPI
Update 2: Looks like Webkit will get this sometime soon! http://trac.webkit.org/changeset/66251
By "streaming" I don't mean HTTP pseudo-streaming of a file sitting on a server. I mean streaming live broadcasts like Flash can do now. But it doesn't have to be some crazy custom protocol like Adobe's media server. Let's make it happen over plain HTTP, with simple extensions to headers and whatnot for time sync. It could be as simple as this:
For the first request, just send a simple HTTP GET for the audio/video live stream URI. The server will send back a chunk of media from the live stream, plus an identifier in a HTTP response header. The media will always begin with a keyframe (for video), so it can play as an independent video file. Each chunk should effectively be its own standalone file, with header info like bitrate, codec, etc. Example HTTP request:
GET /video/live/stream HTTP/1.1
Host: somehost.com
Connection: Keep-Alive
Accept: video/mpeg4, video/ogg, video/webm
X-Desired-Bitrate: 400K/sec
Example response (sans binary data):
HTTP/1.1 200 OK
Content-Length: 338844
Content-Type: video/mpeg4
X-Media-Identifier: 4eed6292a75af286531e0463cc57910e
The X-Media-Identifier identifies this media chunk in the stream, and is then sent in the next request, so the server knows the timecode, and what chunk to send next. The browser can buffer as many chunks as necessary, play them in sequence (they need to play together seamlessly), and then fetch new chunks as the buffer dries up (grabbing and storing the media identifier each time).
If there is a network error or large delay (or the user pauses the video then resumes), the server can just ignore the media identifier and send back the "latest" chunk, hot off the stream. This may cause a "blip" or "cut" in the media, but is necessary to get the player back on track, seeing the latest content. This should be a rare occurrence. Similarly, if the video source itself is lagged or pauses, the server could send back an "empty" chunk containing nothing but black pixels (but still containing its own unique media ID), until the video source resumes. This way the "player" doesn't have to code any special logic for that case.
Why do it this way? Well, if each "chunk" of the live media stream is effectively its own standalone file (but seamlessly welds into the next one in the chain), then a live streaming "player" becomes a very easy piece of code to write indeed. It just needs to chain together a bunch of little media files, buffering a few for network hiccups. Heck, it could almost be done in JavaScript with AJAX calls. That gives me an idea... can binary video data be base64-encoded into a data URL and set on a media element's "src" attribute? That would be sweet.
Worried about a new HTTP connection for each video stream chunk? Don't. HTTP Keep-Alives to the rescue! Using Keep-Alives will make the system reuse the same HTTP connection for multiple requests. Also, "streaming" chunks like this over HTTP makes it play nice with Content Delivery Networks (CDNs) for delivering video to a large audience.
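To illustrate how simple the client side could be, here is a rough sketch of the fetch-and-chain loop, assuming the hypothetical X-Desired-Bitrate and X-Media-Identifier headers described above (how the buffered chunks ultimately get handed to a media element for playback is left open, as noted):
var lastChunkID = '';
var chunkBuffer = []; // standalone, seamlessly-chaining media chunks

function fetchNextChunk() {
   var xhr = new XMLHttpRequest();
   xhr.open( 'GET', '/video/live/stream', true );
   xhr.responseType = 'arraybuffer'; // each chunk is binary media data
   xhr.setRequestHeader( 'X-Desired-Bitrate', '400K/sec' ); // hypothetical header
   if (lastChunkID) xhr.setRequestHeader( 'X-Media-Identifier', lastChunkID );
   
   xhr.onload = function() {
      // remember which chunk we just got, so the server knows what to send next
      lastChunkID = xhr.getResponseHeader( 'X-Media-Identifier' );
      chunkBuffer.push( xhr.response );
      
      // keep a few chunks buffered against network hiccups
      if (chunkBuffer.length < 3) fetchNextChunk();
   };
   xhr.send();
}

fetchNextChunk();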
HTML5 Audio currently has no way of panning audio tracks left or right (adjusting the balance). I recommend adding an HTMLMediaElement.balance property: a floating point value ranging from -1.0 (full left speaker) through 0.0 (center, both channels at full volume) to 1.0 (full right speaker). This should work on both mono and stereo tracks.
var sound = new Audio('env/water-loop.mp3');
sound.balance = -1; // full left speaker
sound.play();
Changing this value should have an immediate effect on audio that is already playing -- i.e. balance should not be buffered.
It would be nice to start an HTML5 Audio "stream" that doesn't have a predefined set of audio samples (like from an audio file or net stream), but rather generated from scratch by JavaScript code in real-time. Simply supply a callback function to "fill" a buffer with samples, which are then played as quickly as possible. The audio system can use double-buffering, and call the function several times before beginning play, to ensure enough samples are available. Obviously, the sound samples are supplied in uncompressed format. Example:
var stream = new AudioStream();
stream.channels = 2; // stereo
stream.bits = 16; // 16-bit sound
stream.sampleRate = 44100; // samples per second

var angle = 0;

stream.addEventListener( 'fillBuffer', function(buffer) {
   for (var idx = 0, len = buffer.length; idx < len; idx++) {
      buffer[idx][0] = Math.sin(angle); // left channel sample
      buffer[idx][1] = Math.cos(angle++); // right channel sample
   }
}, false );

stream.play();
The buffer size is entirely controlled by the audio system. The JavaScript code needs to check buffer.length to know how many samples to generate per call.
The samples are set in the buffer array as floating point numbers between -1.0 and 1.0, and are then auto-converted into N-bit integers by the native audio system. This way the number of bits can be changed without affecting code, such as changing to 8-, 24- or 32-bit audio. Alternatively, there could be a property in the object to request true integers in the buffer, so as not to mess around with floating point accuracy.
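For reference, the float-to-integer conversion the audio system would perform behind the scenes is trivial; for 16-bit output it amounts to something like:
// convert a floating point sample (-1.0 to 1.0) into a signed 16-bit integer
function floatTo16Bit(sample) {
   return Math.round( Math.max(-1, Math.min(1, sample)) * 32767 );
}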
How about allowing the "fillBuffer" event to be registered on a regular Audio object pointing to an existing audio file, with the buffer already pre-filled with the decompressed audio ready to play? Then the JavaScript code could effectively be used to "filter" the audio in real-time, just by manipulating the existing samples. For example, consider a reverb or chorus filter.
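A crude delay/echo filter under that proposal might look something like this (the "fillBuffer" event on a plain Audio element is hypothetical, as is the file path; only the left/mono channel is shown, so repeat for buffer[idx][1] on stereo tracks):
var music = new Audio('music/song.mp3'); // hypothetical file
var delayLine = []; // previously played samples, for a simple echo

music.addEventListener( 'fillBuffer', function(buffer) {
   for (var idx = 0, len = buffer.length; idx < len; idx++) {
      delayLine.push( buffer[idx][0] );
      if (delayLine.length > 11025) {
         // mix in a sample from a quarter second ago (at 44.1 KHz)
         buffer[idx][0] = (buffer[idx][0] * 0.6) + (delayLine.shift() * 0.4);
      }
   }
}, false );

music.play();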
Update: Looks like Firefox supports this: https://wiki.mozilla.org/Audio_Data_API
This would allow JavaScript to capture live audio from whatever source the user's PC has handy, like a microphone or line input. The browser would have to present a dialog box asking the user for access, and allowing him/her to select an audio source (a simple drop-down menu listing the audio devices connected to the machine would be fine). Perhaps add a checkbox to "remember" the setting for subsequent visits to the page (plus a way to revoke this later in the prefs).
var recorder = new AudioRecorder();
recorder.channels = 2; // stereo (if available)
recorder.bits = 16; // 16-bit sound
recorder.sampleRate = 44100; // samples per second
recorder.gain = 1.0; // preamplify or attenuate

recorder.addEventListener( 'receiveSamples', function(buffer) {
   for (var idx = 0, len = buffer.length; idx < len; idx++) {
      // buffer[idx][0] == left or mono channel sample
      // buffer[idx][1] == right channel sample (if stereo)
   }
}, false );

recorder.record();
I suppose the record() function would block until the user granted access (or was previously allowed), and could return false if the user denied access. In the event listener, a buffer of samples collected from the audio input is provided, which can be used in a variety of ways. The samples are uncompressed, which is great for playing back in real time, but bad for sending to a server. How about a way to compress the audio in real-time? Example:
var recorder = new AudioRecorder();
recorder.codec = 'mp3'; // or ogg, etc.
recorder.bitRate = 128; // Kbits per second
recorder.gain = 1.0; // preamplify or attenuate

recorder.addEventListener( 'receiveAudio', function(buffer) {
   // buffer is now a "string" containing binary compressed MP3 data
   // can be sent to a server via AJAX or WebSocket
   // will effectively be a standalone audio file that could play by itself
}, false );

recorder.record();
Similar to my live audio/video streaming feature request, each call to the event listener includes a full "standalone" audio file that could be played by itself, but if played in sequence with subsequent files, would result in a seamless stream. To make this work, the encoder has to "end" each buffer chunk on a keyframe or audio "frame" boundary. This is to avoid clicks and pops often seen with "looping" MP3 files. Therefore, the buffer size is codec specific.
One could theoretically implement both listeners on an AudioRecorder object, to first manipulate the samples in your receiveSamples listener (meaning the buffer array is read/write), and then have it compressed and sent to a server via your receiveAudio listener. Using this technique, one could implement real-time audio manipulation while it is being recorded, such as vocoders, pitch bending, reverb, and whatnot.
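As a quick sketch of that chaining (everything about AudioRecorder is hypothetical, as is the upload URL; WebSocket itself is real):
var recorder = new AudioRecorder(); // hypothetical, as above
recorder.codec = 'ogg';
recorder.bitRate = 128; // Kbits per second

var socket = new WebSocket('ws://somehost.com/audio-upload'); // hypothetical endpoint

// first: manipulate the raw samples in place (a crude distortion; left/mono channel shown)
recorder.addEventListener( 'receiveSamples', function(buffer) {
   for (var idx = 0, len = buffer.length; idx < len; idx++) {
      buffer[idx][0] = Math.max( -0.5, Math.min(0.5, buffer[idx][0]) ) * 2;
   }
}, false );

// then: receive the now-modified audio, compressed and ready to ship out
recorder.addEventListener( 'receiveAudio', function(buffer) {
   socket.send( buffer );
}, false );

recorder.record();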
I suppose this opens up a new world of MP3 licensing hurt, so scratch that encoding format in the example above, and assume I meant to say some other royalty free audio codec that browsers can use for encoding audio. Flash does it now, so there has got to be a good codec (WebM must have an audio layer, right?).
Taking the previous feature request one step further, we could allow full audio and video capture from a Webcam or other video camera attached to the machine. Like audio capture, this would require permission from the user via a dialog, allowing them to select a video device and optionally remembering the selection for subsequent uses by the same page URL.
I can see two different uses for this: capturing video for immediate client-side use of the pixels in the browser (like, render it onto a Canvas element), or sending it to a server for a live video stream. For the former, we should provide the raw video pixels per frame, in a format compatible with Canvas getImageData() / putImageData().
var webcam = new VideoRecorder();
webcam.width = 320; // video width in pixels
webcam.height = 240; // video height in pixels
webcam.frameRate = 30; // frames per second

webcam.addEventListener( 'receivePixels', function(pixels) {
   // pixels is an ImageData-style array of 320 x 240 x 4 elements (32-bit RGBA)
   // feel free to call myCanvas.getContext('2d').putImageData(pixels, 0, 0) with this
}, false );

webcam.record();
As for sending live video to a server, we need a different kind of callback which would receive a chunk of binary compressed audio/video rather than raw pixels, and to specify codecs for both the video and audio. Example:
var webcam = new VideoRecorder();
webcam.width = 320; // video width in pixels
webcam.height = 240; // video height in pixels
webcam.frameRate = 30; // frames per second
webcam.codec = 'ogg'; // video codec
webcam.bitRate = 512; // video Kbits/sec

// might as well use our AudioRecorder() class here, and "attach" it to the video
// since the audio is part of, and sent with, the video stream
// omit this for "silent" video
webcam.recorder = new AudioRecorder();
webcam.recorder.codec = 'ogg'; // audio codec
webcam.recorder.bitRate = 128; // audio Kbits/sec
webcam.recorder.gain = 1.0; // preamplify or attenuate

webcam.addEventListener( 'receiveVideo', function(buffer) {
   // buffer is now a "string" containing binary compressed OGG video (+audio)
   // can be sent to a server via AJAX or WebSocket
   // will effectively be a standalone video file that could play by itself
}, false );

webcam.record();
To show a video "preview" as it is being recorded and streamed to the server, feel free to register both listeners ("receivePixels" and "receiveVideo") on the object, and use the former to render the pixels into a Canvas or something (called every frame), while the latter sends the binary compressed data to the server at regular (much less frequent) intervals.
In fact, so as not to overwhelm your server with simultaneous video chunk uploads, you could implement a simple "queue" mechanism in JavaScript, pushing the chunks onto an array first, then shifting them off one at a time and sending them to the server in series. The sending mechanism would be controlled via a separate timer.
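A sketch of that queue, assuming the hypothetical VideoRecorder "receiveVideo" event above and a plain XMLHttpRequest POST to a made-up /video/upload URL:
var uploadQueue = [];
var uploading = false;

webcam.addEventListener( 'receiveVideo', function(buffer) {
   uploadQueue.push( buffer ); // don't upload immediately, just queue it
}, false );

// a separate timer drains the queue one chunk at a time, in series
setInterval( function() {
   if (uploading || !uploadQueue.length) return;
   uploading = true;
   
   var xhr = new XMLHttpRequest();
   xhr.open( 'POST', '/video/upload', true ); // hypothetical server endpoint
   xhr.onload = xhr.onerror = function() { uploading = false; };
   xhr.send( uploadQueue.shift() );
}, 250 );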
This video stream format is "compatible" with the streaming feature I describe above, as each chunk is a standalone file, with some additional server logic required to store the chunks separately and serve them up with those special media identifiers. The server would have to host the latest N chunks, discarding older ones, assign each an ID, and know the sequence in which they need to be served back. Most importantly, the server software doesn't need to know anything about video codecs, or deal with raw video binary data encoding or decoding at all. It is just serving up "files" which make up the stream.
Update: Looks like this will be available someday soon, via the new HTML Device Element.
It would be nice to capture a screenshot of the current browser window contents (or any DOM element) into an array of pixels, suitable for displaying in a Canvas element or submitting to a server. This would make possible some very interesting visual effects and apps. For example, consider the case where a dialog must be displayed. The developer could screenshot the window, draw the pixels into an overlaid Canvas element, "modify" them with a blur or some other transform to make the contents appear to recede into the background, then show the dialog on top. When the dialog is dismissed, simply remove the Canvas element, restoring the original page image.
If we add a getImageData() method to the HTMLElement class, then any DOM element can be "snapshotted", which would include child elements.
var body = document.getElementsByTagName('body')[0];
var pixels = body.getImageData(0, 0, body.offsetWidth, body.offsetHeight);

var canvas = document.createElement('canvas');
canvas.width = body.offsetWidth;
canvas.height = body.offsetHeight;

var context = canvas.getContext('2d');
context.putImageData( pixels, 0, 0 );
body.appendChild( canvas );

var pngdata = canvas.toDataURL('image/png');
Also, this could be a nifty trick to generate a JPEG or PNG image of the browser window, to submit to a server. Imagine a web based bug reporting system with a bookmarklet widget that allowed the user to send a snapshot of another window to a server and attach to a bug. Or imagine an HTML5 game where the user can "save" his/her progress, and the saved game comes with a thumbnail screenshot of the actual game screen!
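Submitting that snapshot would then just be an ordinary POST; a minimal sketch, assuming a made-up /bugs/attach endpoint and the pngdata variable from the code above:
var xhr = new XMLHttpRequest();
xhr.open( 'POST', '/bugs/attach', true ); // hypothetical bug tracker endpoint
xhr.setRequestHeader( 'Content-Type', 'application/x-www-form-urlencoded' );
xhr.send( 'screenshot=' + encodeURIComponent(pngdata) );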
For security, cross-domain content should be exempt from the screenshot (I'm thinking primarily about IFRAMEs -- cross-domain images are fine). This way banner ads running in IFRAMES cannot capture the outer window content. If your banner ads are hosted in the same domain as your page, then shame on you. Also, this would prevent a malicious script from rendering an IFRAME pointed at, say, bankofamerica.com, and capturing the content therein.
Update: Looks like Firefox deliberately does not support this due to security concerns.
It would be nice to be able to control the rendering mode (composite operation) of any DOM element via CSS, which would affect how it is superimposed on elements or backgrounds under it. For example, right now everything is superimposed in "normal" mode, but we are allowed to set the opacity. That's a good start, but imagine also being able to set the blend mode (like in a Photoshop layer), to things like Dissolve, Screen, Darken, Multiply, Add, Subtract, Difference, and more. These are mathematical algorithms that control the final rendered visible pixels given the layer content and the background behind it. This is very similar to the Canvas globalCompositeOperation, but would be available for any DOM element, specified via CSS.
div.sprite {
   position: absolute;
   z-index: 2;
   composite: add;
}
This would allow for some crazy cool effects, especially in games.
Update: Looks like this might be possible in Firefox through the use of SVG filters: https://developer.mozilla.org/en/applying_svg_effects_to_html_content
Right now it is up to the browser how CSS transforms are "interpolated", which means how pixels are rendered when the transform distorts the element. So pixels either have to be blurred, averaged, or otherwise interpolated. It would be really nice to be able to control this, because sometimes we want "nearest neighbor" interpolation (imagine a painting app where the document can be "magnified" like Photoshop's zoom tool). Or in games, perhaps bicubic interpolation is too slow for hundreds of sprites, and bilinear would be more desirable.
div.sprite {
   transform: rotate(45deg);
   transform-interpolation: bilinear;
}
Update: Looks like Firefox mostly supports this via its "image-rendering" CSS property. Now we just need Webkit to hop on board, and/or get this in the next CSS specification.
Right now we have various ways to transform DOM elements with CSS: translate, scale, rotate and skew. What would be really cool is to just specify the four corners of the bounding rectangle as independent 2D points in space, and have the browser "distort to fit" the shape. This way we can achieve any distortion imaginable.
div.sprite {
   transform: distort-top-left(0px 0px) distort-top-right(50px 0px) distort-bottom-right(50px 50px) distort-bottom-left(0px 50px);
}
This allows for complete control over the bounding rectangle corner points of the DOM element, so any transform is technically possible. By specifying the four corners as raw points (these should be interpreted as offsets from the natural top-left of the object), we can actually achieve all the built-in transforms (translate, scale, rotate, skew), plus anything else we desire.
Safari has the ability to perform "3D" transforms, effectively giving DOM elements a "z" coordinate which simulates depth. Extending our new distort transform a bit, we could simply support an optional 3rd pixel coordinate which would be the "z" position in virtual 3D space.
Right now, Safari seems to be the only browser doing partial hardware acceleration on some objects, if and only if their 3D transform system is used. Instead of hiding hardware acceleration in an easter egg like this, let's make it official and have elements request it as needed. For instance, a perfect candidate is a container DIV for an HTML5 game, which will have tons of animation going on inside it (absolutely positioned child elements for sprites and tiles).
div#game {
   position: relative;
   width: 640px; height: 480px;
   rendering: hardware-accelerate;
}
The element must have a hard pixel width and height (like a Canvas element), so the browser can generate a GPU texture map at the target size.
This should be taken as more of a "request" than a "demand". Meaning, this could be abused and people may just start setting it on the BODY tag, forcing the entire page to be rendered in hardware. While this may work for some websites, doing so may use up all the VRAM and ultimately slow down the user's PC. It is up to the browser (and graphics hardware) to figure out if hardware acceleration is warranted (given the size of the object) and how much VRAM is available.
Taking this one step further, such hardware accelerated objects may also benefit from manual control over rendering or "reflow". Meaning, instead of the browser deciding when to redraw objects after they move (or new child elements added/removed), a game may have its own timer system and would want to manually specify when this reflow occurs. How about:
div#game {
   position: relative;
   width: 640px; height: 480px;
   rendering: hardware-accelerate;
   redraw: manual;
}
Then, when the game's main event loop has updated all the game object positions and wants to trigger a redraw, it is requested manually using a new HTMLElement.redraw() method:
document.getElementById('game').redraw();
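In a game's main loop, that might look something like this (redraw() is hypothetical; the update function and frame rate are just illustrative):
var game = document.getElementById('game');

function updateSpritesAndTiles() {
   // reposition all the absolutely positioned child elements here
}

setInterval( function() {
   updateSpritesAndTiles(); // move everything first
   game.redraw();           // then ask the browser to repaint the container once
}, 1000 / 30 );             // 30 frames per second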
I have seen firsthand how all the current browsers will sometimes fail to cache HTML5 Audio and Video files (even if the HTTP headers say to do so), instead reloading them from the server over and over; sometimes they cache them at first, then randomly "flush" them even while the page is still active, so they have to be downloaded again. This wreaks havoc on things like game engines. A game engine relies on media like audio clips being immediately available at all times, for sound effects that must play without latency.
It would be great to give the browser a hint as to which HTML5 MediaElements are extremely important, and should not be flushed from the cache unless there is absolutely no free memory. Also, these elements should be cached in RAM whenever possible, as opposed to on disk. How about HTMLMediaElement.priority (integer).
var sound = new Audio();
sound.src = 'sounds/sfx/shoot.mp3';
sound.priority = 1; // highest priority
sound.load();
The priority property serves two purposes. First, it allows the web developer to rank their media elements, so those with lower priorities (higher numbers) will be flushed first, and those closer to "1" will be flushed last. Also, anything with a priority level of "1" should never be flushed while the page is still active, unless absolutely necessary (i.e. the OS runs out of memory).
Obviously the browser must treat this as a request rather than a demand, because a silly developer could load a 2GB movie and set its priority to "1". So the browser should only do this if memory is available, and the developer is not an idiot. Perhaps set a limit in the browser preferences, such as all combined media cached in RAM cannot exceed 100MB.
Currently, browsers only allow human input from a single "mouse" and a single "keyboard" as human interface devices (and of course touch events on mobile). But there are a plethora of other input devices available to PCs, from gamepads, to joysticks, to steering wheels (potentially with pedals), to those knob things, to those pressure-sensitive graphics tablets with styluses, not to mention Wiimotes. Wouldn't it be great if browsers had a way to read raw events from any of these devices, simply by registering the right event listeners? All modern operating systems already have a HID API for native apps, so why not pass this through to the browser DOM event system? Imagine this:
document.addEventListener( 'hidButtonDown', function(event) {
   // user pressed a button on a HID device
   event.hidName; // name of device, e.g. "Logitech USB Gamepad"
   event.hidControlID; // ID of button, e.g. "6"
} );

document.addEventListener( 'hidButtonUp', function(event) {
   // user released a button on a HID device
   event.hidName; // name of device, e.g. "Logitech USB Gamepad"
   event.hidControlID; // ID of button, e.g. "6"
} );
Simple, eh? The way I understand the HID system, every control on a HID device is given a numerical ID, including the 4-directional pad on gamepads (each of the 4 directions is a different ID), and any button or switch that has a simple "on" or "off" state. It would be up to the developer to provide a UI for assigning controls to actions in their app or game. Could be as simple as: walk through each control, display a dialog that says "press the key for Move Left", wait for the HID event, assign it to an object somewhere, repeat. These control assignments could be saved in local storage or on a server, and restored when the same user returns later.
Also, given a simple database of HID device names (e.g. "Logitech USB Gamepad"), an app or game could provide default assignments for all the controls on known devices, so users wouldn't have to manually assign controls (unless they wanted to customize them).
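A sketch of that assignment flow, using the hypothetical "hidButtonDown" event along with ordinary localStorage:
// load any previously saved control assignments
var assignments = JSON.parse( localStorage.getItem('hidAssignments') || '{}' );

function assignControl(actionName) {
   // show a "Press the control for <action>" dialog first, then wait for the event
   var listener = function(event) {
      assignments[actionName] = {
         device: event.hidName,       // e.g. "Logitech USB Gamepad"
         control: event.hidControlID  // e.g. "6"
      };
      localStorage.setItem( 'hidAssignments', JSON.stringify(assignments) );
      document.removeEventListener( 'hidButtonDown', listener, false );
   };
   document.addEventListener( 'hidButtonDown', listener, false );
}

assignControl('moveLeft');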
But what about analog sticks, pressure sensitive pedals, knobs and whatnot? No problem at all. Let's just define a different event type for those:
document.addEventListener( 'hidSliderChange', function(event) {
   // user moved an analog slider on a HID device
   event.hidName; // name of device, e.g. "Logitech USB Gamepad"
   event.hidControlID; // ID of slider control, e.g. "6"
   event.hidValue; // value of slider, from 0.0 to 1.0
} );
All the analog HID controls, including analog sticks on a gamepad, joysticks, steering wheels, pedals and knobs, can be described as "sliders" which have a range from 0.0 to 1.0. Each time one of these changes, an "hidSliderChange" event is sent which can be caught and acted upon. For "2D" analog sticks that can move in a horizontal and vertical direction, there would be two separate slider IDs, one for the horizontal, and one for the vertical. Accelerometers like those found in the Wiimote and PS3 Sixaxis can be read in the same way.
Analog controls tend to vary, so the developer is encouraged to provide a UI allowing the user to "calibrate" their device. This would involve simply reading slider events while the user centered the slider, moved all the way to the left (and/or top), and all the way to the right (and/or bottom). In the case of a pedal this would mean press it all the way down, then let it back up. This way the developer could get a proper range of values (the analog control may not go all the way to 0.0 and 1.0, and 0.5 may not be exactly centered).
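A sketch of that calibration step, using the hypothetical "hidSliderChange" event:
// record the extremes the user actually reaches while waggling the control,
// then normalize future readings against that observed range
var calib = { min: 1.0, max: 0.0 };

document.addEventListener( 'hidSliderChange', function(event) {
   if (event.hidValue < calib.min) calib.min = event.hidValue;
   if (event.hidValue > calib.max) calib.max = event.hidValue;
} );

function normalize(rawValue) {
   return (rawValue - calib.min) / (calib.max - calib.min);
}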
A graphics tablet is a special input device, which doesn't exactly fit into the HID events we defined thus far. We need something a little more comprehensive, especially considering the pixel precision needed, and pressure sensitivity.
document.addEventListener( 'hidTabletTouchStart', function(event) {
   // user started a new event on the tablet
   event.hidName; // name of device, e.g. "Wacom Intuos4 Medium"
   event.hidX; // X coordinate of pen
   event.hidY; // Y coordinate of pen
   event.hidPressure; // pressure of pen, from 0.0 to 1.0
} );

document.addEventListener( 'hidTabletTouchMove', function(event) {
   // user dragged the pen on the tablet
   event.hidName; // name of device, e.g. "Wacom Intuos4 Medium"
   event.hidX; // X coordinate of pen
   event.hidY; // Y coordinate of pen
   event.hidPressure; // pressure of pen, from 0.0 to 1.0
} );

document.addEventListener( 'hidTabletTouchEnd', function(event) {
   // user released the pen from the tablet
   event.hidName; // name of device, e.g. "Wacom Intuos4 Medium"
   event.hidX; // X coordinate of pen
   event.hidY; // Y coordinate of pen
} );
The hidX and hidY values should be pre-scaled to screen coordinates.
So what about force feedback? Well, considering someone has a patent on it, we probably won't see this in browsers anytime soon. Yay patent system!
JavaScript rarely had to deal with binary data; it used to be all text and numbers. But now that things like Data URLs and Local File APIs are popping up, the need arises. Currently there is no good way to access and manipulate binary data stored in a string, and a JavaScript Array object is ill-equipped. What we need is an accelerated "ByteArray" class that allows array-style access to binary data, but also the ability to fetch the data as a string. Adobe has a nice ByteArray implementation in ActionScript 3, but it has a complex API. How about something simple like:
var pngData = canvas.toDataURL('image/png');
var rawData = atob( pngData.substring( pngData.indexOf(',') + 1 ) );

var bytearray = new ByteArray( rawData );
bytearray[256] = 0;
bytearray.replace( 300, "some more binary data here" );

rawData = bytearray.getRawData();
pngData = "data:image/png;base64," + btoa( rawData );
The idea here is that the ByteArray object doesn't allocate any new memory; it just serves up the raw binary data and allows it to be manipulated directly. This way accessing array elements and even calling getRawData() is extremely fast. The replace() method you see used in the example replaces a section of the binary data with another provided chunk (just overwrites the data at the specified offset). This would also be extremely fast -- just a memcpy() or whatnot.
Currently, the Canvas getImageData() method returns an array which has an element for each color component of each pixel. Meaning, each "pixel" in the canvas actually has 4 separate elements in the array (one for red, green, blue and alpha respectively), so the array is four times larger than the canvas has pixels. While this is convenient, it is highly unoptimized. It would be much preferable, in many cases, to receive an array where each pixel is a single 32-bit integer. That way, it only takes a single JavaScript array assignment to set a pixel, not four separate assignments. Performing 32-bit RGBA bitwise calculations is easy.
imageData = context.getImageData( 0, 0, 640, 480, 32 ); // request 32-bit integers
var pixels = imageData.data;

// precalculate a "white" opaque pixel as a 32-bit int
var alpha = 255, blue = 255, green = 255, red = 255;
var pixel = (alpha) + (blue << 8) + (green << 16) + (red << 24);

for (var idx = 0, len = pixels.length; idx < len; idx++) {
   pixels[idx] = pixel;
}

context.putImageData( imageData, 0, 0 );
In this example we have supplied a 5th argument to getImageData(), specifying we want 32-bit integers instead of 8-bit. This code is likely going to be about 4 times faster than normal, because we are only executing a single operation per pixel, instead of four.
More and more data is sent via Form Submits and XHR POSTs nowadays, especially with the newfangled Data URLs and Local File APIs. Wouldn't it be nice if forms and XMLHttpRequest objects broadcast "progress" events, sort of like the new HTML5 Audio and Video elements do when loading data?
var form = document.getElementById('myform');

form.addEventListener( 'progress', function(event) {
   var loaded = event.loaded; // bytes sent so far
   var total = event.total; // total bytes to send
   var percent = (event.loaded / event.total) * 100;
}, false );

form.submit();
Using this, the page could display some sort of progress indicator while uploading files or some other large dataset. The same listener could work on an XMLHttpRequest object in the same way. The browser should invoke a progress event several times per second.
It would be really nice to be able to create true native hierarchical (nested) menus in <select> elements. Here is one way it could work:
<select name="sections">
   <option value="blog">Blog</option>
   <option value="discussions">Discussions</option>
   
   <select name="games" label="Games">
      <option value="tetris">Tetris</option>
      <option value="mario">Mario</option>
   </select>
   
   <option value="about">About Us</option>
</select>
Notice the nested <select> element right alongside the normal <option> elements. This would show in the list of items as a hierarchical menu with the label "Games", and popup the two nested items therein. If one of the nested items were selected, its text would show visibly in the outer menu as the "selected" element, however the selectedIndex of the outer menu would be "2" (pointing at the "games" item), and the selectedIndex of the "games" menu would be whatever selection you made (as if the inner menu was standalone). If the form was then submitted, it would include both select fields in the POST data. Example:
sections=games&games=tetris
If one of the outer elements was chosen instead, the "games" field would not even be included in the form POST data submitted to the server. Example:
sections=blog
I am sick of trying to implement cross-browser solutions for centered content, especially when the width and height of my element need to be dynamic (not fixed). Wouldn't it be great if CSS simply allowed "vertical-align" on all elements, and added a "horizontal-align" attribute to match?
div.fullcenter {
   width: auto;
   height: auto;
   horizontal-align: middle;
   vertical-align: middle;
}
That would just make my day.
These are requests targeted at specific browsers (i.e. other browsers already have the features).
Safari and Chrome have so many problems with HTML5 Audio it would be folly to list them all here (see here). Specifically, I am referring to using the system for short sound effects (like interface clicks or video game effects). The problems range from sounds randomly not loading, randomly not playing, playing after a huge delay (1+ seconds), sounds falling out of cache and needing to be reloaded, and general instability. Currently, Flash is a much better bet for game sound, despite its horrible lag on Windows.
Also, Apple: Please add OGG and WebM support to Safari. Rather, I guess it should be: Apple: Please add OGG and WebM decoders to QuickTime. Why not?
In Firefox when you change the volume of a track that is currently playing, it takes up to 1 - 2 seconds for the change to be heard. I realize the audio must be buffered, but it seems like the volume changes are basically a "preamp" which is applied at the sample level, which then goes into the buffer. Instead, the volume property should control a real volume setting in the audio driver itself (post-buffering).
I see that the HTML5 spec defines the spellcheck attribute, and Firefox has had support for this since version 2. Safari/Chrome, please implement this!
A huge thank you goes out to the Webkit team for implementing CSS Masks, which are extremely awesome. Firefox, please, please implement this!
Update: Looks like this is possible through the use of SVG filters: https://developer.mozilla.org/en/applying_svg_effects_to_html_content. The question I have is, can you use an image as the mask?
Safari has taken the lead with awesome hardware accelerated graphics (activated via CSS 3D transforms). Firefox and Chrome, please implement this!
Update: Thank you Firefox team for implementing hardware acceleration in Firefox 4 for Windows! Can we Mac users get some love too? http://hacks.mozilla.org/2010/09/hardware-acceleration/