[Proposal] ImageBitmap.getImageData
This is meant to address #3802, and supersedes #4748. More background info can be found on those two issues.
There is currently no high-performance code path for getting ImageData from a MediaStreamTrack or an HTMLVideoElement. The current idiom for doing so involves drawing to an HTMLCanvasElement (or OffscreenCanvas where available) and creating a new ImageData each time. This is particularly low-performance on lower-spec devices.
For the purposes of this proposal, high-performance means:
- Manual control over memory allocation/deallocation instead of leaving it to GC
- Low enough latency to maintain reasonable (20+ FPS) frame rates
I have a repo with performance results testing the various methods of getting ImageData for webcam frames in the latest Chromium and Firefox, using a Raspberry Pi Model 3B+ as a low-spec, easily reproducible test bed. The low memory (1 GB) is of particular concern for this kind of test, and low disk performance means swapping at that rate will lock up the device. Firefox fails all tests, as it immediately runs the system out of memory when creating ImageData for 1080p video, presumably because its garbage collector is overwhelmed. This isn't entirely surprising: at 30 FPS that's roughly 240 MB/sec of garbage being generated. This is why a memory-efficient way of generating ImageData for webcam frames is crucial.
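As a back-of-envelope check on that garbage rate (the arithmetic here is mine, not taken from the test repo):

```javascript
// One 1080p RGBA ImageData is width * height * 4 bytes.
const bytesPerFrame = 1920 * 1080 * 4;      // 8,294,400 bytes (~8 MiB)
const bytesPerSecond = bytesPerFrame * 30;  // discarded every second at 30 FPS

console.log(bytesPerFrame);                        // 8294400
console.log(Math.round(bytesPerSecond / 2 ** 20)); // 237 (MiB/sec of garbage)
```

That's in line with the ~240 MB/sec figure above, and it all has to be reclaimed by GC under the current idiom.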
As opposed to #3802 and #4748, this proposal seeks to remove the need to use a canvas as an intermediary for getting ImageData, which provides a performance boost, and as a new API it can make changes which would otherwise be breaking for CanvasRenderingContext2D.getImageData.
The main usefulness in this proposed change is the ability to steal the memory from an existing ArrayBuffer when getting ImageData from an ImageBitmap. Since an ArrayBuffer can be detached/neutered, this allows a clean way to reuse memory without dangling references.
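A minimal sketch of the detach semantics this relies on, using structuredClone with a transfer list as the closest analogue available today (the variable names are illustrative):

```javascript
// "Steal" the memory from an existing ArrayBuffer by transferring it:
// the old reference is detached, the same memory gets a new owner.
const oldBuffer = new ArrayBuffer(1920 * 1080 * 4); // one 1080p RGBA frame
const newBuffer = structuredClone(oldBuffer, { transfer: [oldBuffer] });

console.log(oldBuffer.byteLength); // 0 - detached, no dangling reference
console.log(newBuffer.byteLength); // 8294400 - same memory, new owner
```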
Here's the proposed change in Web IDL (no strong opinion on any naming). The (perhaps poorly named) neuter option would call ImageBitmap.close before resolving the ImageBitmap.getImageData promise, with the idea being to avoid triggering GC as much as possible. Actual testing shows mixed (and possibly negligible) results, so it may be dropped as an option.
dictionary GetImageDataOptions {
  ArrayBuffer? buffer;
  boolean neuter = false;
};

partial interface ImageBitmap {
  [NewObject] Promise<ImageData> getImageData(long sx, long sy, long sw, long sh, optional GetImageDataOptions options = {});
};
Benefits of this proposed change:
- Stable memory footprint - reusing the memory from an existing ArrayBuffer allows developers to know the memory footprint of their code rather than leaving it to the fluctuations of GC, which, as seen by the Firefox test results, may be inadequate.
- Helps avoid memory fragmentation. Similar to the stable memory footprint, reusing memory will help avoid memory fragmentation, which slows down performance. Fragmentation is prone to happen with these sorts of large allocations of contiguous memory: once a multi-megabyte chunk is freed, a small object may be allocated in it, preventing it from being reused and instead requiring a new multi-megabyte chunk to be allocated. Allocation methods which use arenas can suffer from a 'high water mark', holding significant system memory with most of it unused by the application, but unavailable to the system.
- Since ImageBitmap is [Transferable], it can be efficiently offloaded to a web worker where the ImageData can be extracted and processing done off the main thread. Firefox currently has no way to perform this extraction of ImageData on a web worker due to not supporting 2d as a context for OffscreenCanvas.
- Pairs efficiently with methods of getting an ImageBitmap for webcam data or video, such as createImageBitmap(videoElement) and ImageCapture.grabFrame.
- Avoiding canvas removes the potential for captured webcam frames being copied from CPU to GPU and back to CPU when accelerated canvases are in use.
- Returning a promise allows it to be used on the main thread without blocking, and opens the possibility of using multiple threads for particularly large resolutions.
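The worker offload mentioned above could look roughly like this. This is a browser-only sketch: getImageData here is the proposed (not yet real) API, and imageCapturer, worker, and the function name are my own illustrative stand-ins.

```javascript
// Main thread: grab frames and transfer each ImageBitmap to a worker,
// so pixel extraction and processing happen off the main thread.
function pumpFramesToWorker(imageCapturer, worker) {
  return (async () => {
    while (true) {
      const bitmap = await imageCapturer.grabFrame();
      // Transfer, don't copy: the bitmap's backing store moves to the worker.
      worker.postMessage({ bitmap }, [bitmap]);
    }
  })();
}

// In the worker, the proposed API would then extract the pixels:
// self.onmessage = async ({ data }) => {
//   const imageData = await data.bitmap.getImageData(0, 0, 1920, 1080);
//   // ... process imageData, reusing its buffer for the next frame ...
// };
```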
Open questions:
- How to deal with concurrency issues due to the asynchronous API when reusing an ArrayBuffer? I believe as long as the original ArrayBuffer is detached before the promise is returned, it should be no different than transferring an ArrayBuffer currently.
- How to deal with concurrency issues when an async ImageBitmap.getImageData is in progress and an ImageBitmap.close or transfer occurs? I would assume both issues also exist with using createImageBitmap(otherImageBitmap), although I don't see anything mentioned in the spec about concurrency issues there.
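The detach-before-return ordering suggested for the first question can be sketched in plain JavaScript (getImageDataLike is a hypothetical stand-in, not the real API):

```javascript
// Steal the buffer synchronously, before any async work starts, so the
// caller can never observe or race on the old memory while work is in flight.
function getImageDataLike(buffer, fillByte) {
  // Detach `buffer` up front; from here on it has byteLength 0.
  const stolen = structuredClone(buffer, { transfer: [buffer] });
  return (async () => {
    const pixels = new Uint8Array(stolen);
    pixels.fill(fillByte); // stand-in for the actual pixel copy
    return stolen;
  })();
}

const buf = new ArrayBuffer(16);
const promise = getImageDataLike(buf, 0xff);
console.log(buf.byteLength); // 0 - detached synchronously, before the promise settles
```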
The repo I linked earlier in this post contains patch implementations for both Chromium and Firefox. They're incomplete and were done just for testing the 'go right' path: they don't have error handling, and they're not pretty (apologies to the devs for bastardizing their code). The Chromium patch shows 2-3x the performance of the current best method, achieving 30 FPS at 720p@30 and over 20 FPS at 1080p@30. In the latter case the bottleneck is elsewhere in the code. For Firefox there can be no performance comparison since it can't complete the testing without the patch to begin with.
Overall this seems low-effort to implement, with a large positive impact.
I don't really understand the proposal (in particular it seems you're suggesting to use a detached ArrayBuffer
somehow, but that's not possible). I recommend reading https://whatwg.org/faq#adding-new-features and perhaps starting a thread at https://discourse.wicg.io/ to refine this a bit.
I don't really understand the proposal (in particular it seems you're suggesting to use a detached ArrayBuffer somehow, but that's not possible).
@annevk, I'm not suggesting using a detached ArrayBuffer, I'm suggesting detaching one.
let imageBitmap; // Assume this is an ImageBitmap we want ImageData from
const oldImageData = new ImageData(1920, 1080);
const oldArrayBuffer = oldImageData.data.buffer;
const newImageData = await imageBitmap.getImageData(0, 0, 1920, 1080, { buffer: oldArrayBuffer });
const newArrayBuffer = newImageData.data.buffer;
// oldArrayBuffer is now detached - the memory it had now belongs to newArrayBuffer
That's an overly wordy example because of the variable names; here's how it would actually be used with ImageCapture.grabFrame:
let imageBitmap;
let imageData = new ImageData(1920, 1080);
while (true) {
imageBitmap = await imageCapturer.grabFrame();
imageData = await imageBitmap.getImageData(0, 0, 1920, 1080, { buffer: imageData.data.buffer });
// Process ImageData
}
This would be useful also for cases where code wants to get raw decoded image data. Currently a constructed ImageBitmap is the only way to disable alpha premultiplication (well, aside from WebGL, but that's much more complicated). However, there is currently no way to read raw pixels back from the canvas once the image bitmap has been transferred to it, because a canvas allows getting a context of only a single type, and getImageData lives only on the 2D context today.
Referring to the closure of #4748
While I agree having this method on ImageBitmap is also a very good idea, I don't see why the CanvasImageData mixin's getImageData method shouldn't get extended too.
It's not rare at all to have to get updates every frame from the previous drawings, and having to allocate new memory every time also means more garbage to be collected every frame.
Unless I'm missing something, generating an ImageBitmap from a Canvas also requires allocating new memory, so for this case, this proposal doesn't add much.
@Kaiido, thanks for taking a look at this.
While I agree having this method on ImageBitmap is also a very good idea, I don't see why the CanvasImageData mixin's getImageData method shouldn't get extended too.
Sure, there's plenty of benefit to it existing in both places. As I mentioned in the original comment ("in providing a new API it can have changes which would otherwise be breaking for CanvasRenderingContext2D.getImageData"), there's a downside to the existing CanvasRenderingContext2D.getImageData in that it doesn't return a promise, and changing that would be a breaking change to that API, so that's not going to happen. While this API doesn't have to return a promise, doing so allows more flexibility for high-performance implementations, allowing the cropping and copying of memory to not block the main thread.
So, yes, a synchronous version of this proposal could also be implemented on CanvasImageData.getImageData.
Unless I'm missing something, generating an ImageBitmap from a Canvas also requires allocating new memory, so for this case, this proposal doesn't add much.
I believe you're right: if you're solely concerned with ImageData from a canvas, then you're still ending up with an allocation. If you're concerned with video data though (which is my main motivation for this proposal), you skip the canvas intermediary and use an ImageBitmap directly from the video. There's no way to avoid an allocation per video frame you want ImageData for without a much more significant change to APIs. The nice thing about ImageBitmap is that with the close method you have direct control over when that memory is released, so you don't get garbage collection churn.
I also had this same problem, and the only way I found to get around it was through a service worker, where for each request I filter until I get the image I want and use this code to operate on the image's binary data:
fetch('https://upload.wikimedia.org/wikipedia/commons/7/77/Delete_key1.jpg')
  .then(res => res.blob()) // Gets the response and returns it as a blob
  .then(async (blob) => {
    console.log(blob)
    console.log(await blob.text())
  });
I also had this same problem
Are you sure it was the same problem? I don't see how your fetch snippet is relevant to or helps with getting image data out of canvas context.
Depending on the need, blob.text() can be exchanged for blob.arrayBuffer(). This code snippet shows how to get image data out of the context of the screen, but for this to work on the whole page I had to use a service worker.
@Slender1808 Sorry, but it still looks unrelated. Your snippet doesn't do anything with screen or canvas, let alone ImageData, it just loads an image from a URL. Perhaps you misunderstood what this proposal is about?
Sorry, I thought this was about trying to get data from an image without using a canvas for that.
cc @whatwg/canvas
I'm a bit torn by this asynchronicity thing.
On the one hand, I can see very well how having more async APIs is a good thing, and how we should aim at making most new APIs that could work in parallel Promise based.
On the other hand, as has been said in the initial comment, there are a few issues in this exact case which makes it very hard to make this API actually run "in parallel".
I believe that the slowest operation that will be done here will be the read-back of bitmaps living in the GPU*, and this can't be done asynchronously because indeed the ImageBitmap could be closed right after the call, synchronously discarding the bitmap before it's been read.
I fear this ends up just like createImageBitmap, which is actually synchronous with all but Blob sources.
This is in my opinion a problem because web-developers will certainly assume that since it's async, it's done in parallel and won't block the UI thread.
As an example, I saw people believing that using XHR.responseXML was better than using a DOMParser, because XHR is async and they didn't realize that the parsing is actually done synchronously in the .responseXML getter.
I digress, but I guess you can imagine how bad it can be to make the users believe their code won't block when it will actually do, "Let's batch one call per pixel in a loop to get all pixel values, it's async anyway".
*(I could be wrong here, maybe color-space conversion or unmultiplying are also slow operations?)
I believe that the slowest operation that will be done here will be the read-back of bitmaps living in the GPU*, and this can't be done asynchronously because indeed the ImageBitmap could be closed right after the call, synchronously discarding the bitmap before it's been read.
A new pending async readback would hold a strong ref to the resource, preventing actual closing of the resource. (These are the details behind the scenes, but this way close does precipitate release mostly deterministically, without relying on GC.) (The alternative would be rejecting outstanding read promises on close(), but I don't think we want that.)
*(I could be wrong here, maybe color-space conversion or unmultiplying are also slow operations?)
cpu-side colorspace conversion is slow (loooots of ops per pixel). Readback isn't really cpu-heavy per se, but rather potentially high-latency, as we have to wait for that resource to be "done". Otherwise it's just a copy or two, not slow per se.
As a user that reads images, reads their data (for elevation GIS tiles), modifies their data (steganography), and needs to write it back out as a modified image (whew!), I REALLY need a lossless workflow.
ImageBitmap is at least hopeful with options to avoid alpha premultiply and color-space conversion.
For example, I currently have to use a webgl stunt to convert an image into reliably lossless data. It gets weirder bundling the new modified data back to the file system.
Have pity on us image-as-data folks!!
I hope this gets the interest of implementers, because it'd be nice to have this more direct access to ImageData.
Ran into yet another project where I'd really want to decode image to raw RGBA data and remembered this issue. It would be really helpful to have getImageData on ImageBitmap itself, rather than going through the whole create canvas -> get 2D context -> draw image -> get image data back, both because it's slow & unergonomic, and because it doesn't actually preserve original RGBA values (due to premultiplication mentioned earlier).
It should be noted that WebCodecs now offers a means to get that data without going through a canvas. You can either create a VideoFrame from your ImageBitmap object and then use its copyTo() method to get the raw data, or even use the ImageDecoder API directly to do the decoding.
This is a bit lower level than what I imagined an ImageBitmap#getImageData() would have been, since authors will have to handle the various formats the image can be stored in (I420, BGRA, etc.), but I believe this is actually a good thing, since it allows for a faster path, which is what I believe is being asked for here.
@Kaiido Thanks, I missed the ability to do this via new APIs (probably because, intuitively, VideoFrame seemed unrelated to just image decoding).
But yeah, while performance is one aspect of it, if it returns different formats then it doesn't quite solve the use case of just decoding an image into RGBA data like we can with canvas. This step is still important/useful for graphic manipulation via external libraries, especially those compiled to Wasm.
Admittedly, users can compile format conversion to Wasm too, but, assuming I'm not alone in using graphic manipulation libraries in Wasm, it would be good to avoid that bloat and have the format handled by the decoding APIs (maybe as part of ImageDecoder options?). It seems like all the necessary bits and functionality are already there in browsers, just not quite exposed via the API.
https://github.com/w3c/webcodecs/issues/92 is about adding such an API, however the current status is that it's not really required, because going through an ImageBitmap would return the data as RGB[A|X].
So if you want raw data as fast as possible, you take the data directly from the VideoFrame that the MediaStreamTrackProcessor, VideoDecoder, or ImageDecoder produced, in whatever format it comes. And if you want RGBA data, you go through an ImageBitmap. For your case I'm not sure which would be the fastest between directly creating the ImageBitmap from a Blob, or going through an ImageDecoder + createImageBitmap(), but I suspect both would be faster than going through a canvas anyway.
However, I don't see anything there enforcing that behavior, and indeed in my tests against current Chrome's implementation, it does correctly convert from NV12 to RGBA, and from I420 to RGBA, but it keeps BGRA as BGRA. Admittedly it's trivial to convert between these latter formats, but this tends to prove that authors can't really be sure the conversion will be done, and having to do even a trivial conversion is cumbersome anyway. I'll comment there to see if that conversion can be enforced.
it's not really required because going through an ImageBitmap would return the data as RGB[A|X]
And if you want RGBA data, you go through an ImageBitmap.
Not sure what you mean here by "going through an ImageBitmap". I mean, this issue is exactly about getting RGBA data (as Uint8Array) from ImageBitmap, which today is not possible without additionally going via Canvas -> 2d context -> drawImage -> getImageData.
Ah, I guess you're saying that creating a VideoFrame from an existing ImageBitmap is supposed to convert it to RGBA (except for the https://github.com/w3c/webcodecs/issues/92 issue), so then copyTo would give a way to retrieve that RGBA data as raw bytes?
If so, yeah, if that issue is resolved and formats are converted consistently, that could be a viable alternative.
Ah, I guess you're saying that creating a VideoFrame from an existing ImageBitmap is supposed to convert it to RGBA (except for the w3c/webcodecs#92 issue), so then copyTo would give a way to retrieve that RGBA data as raw bytes?
Yes that's what I mean.
Thanks for the clarification. It's an interesting alternative.
It does seem a little worrying in terms of performance implications (will it always copy the image data to the GPU and read it back?) but I agree it should still be faster than canvas... except that maybe canvas + willReadFrequently would be faster still?
I'm not sure if this is useful, but I've got a webgl function that turns an image into bytes with no distortions via alpha premultiply and color corrections. https://github.com/backspaces/agentscript/blob/master/src/RGBADataSet.js#L27
I'm not sure if this is useful, but I've got a webgl function that turns an image into bytes with no distortions via alpha premultiply and color corrections. backspaces/agentscript@master/src/RGBADataSet.js#L27
Have you managed to get this working using an HTMLVideoElement as a TexImageSource? It should be working according to the spec. I updated the snippet to use the webgl2 context, which should support it.