Async image decoding

Open samdenty opened this issue 4 years ago • 5 comments

I've searched previous issues, but couldn't find a mention of anything async.

The main reason for needing async is that decoding images from the network takes quite some time. reqwest provides the bytes_stream method, but this yields an async stream of byte chunks (adaptable to tokio's AsyncRead trait), not a sync Read.

From looking over the decoder, it seems that it'd be quite difficult to switch, as all of the external image decoding libraries (png, gif...) only accept the sync-only Read trait.

Has async been considered, and how difficult would it be to implement? Or can streaming image decoding already be accomplished with reqwest?

samdenty avatar Jan 03 '21 14:01 samdenty

You are correct that there is currently no direct async support in this crate. I'm not really all that familiar with the state of the various async I/O traits, so I can't say for sure how difficult it would be to add. From a quick search though, it looks like things are in flux, so perhaps we'd want to hold off until they stabilize?

I can't find any reference online, but is it perhaps possible to implement a wrapper struct that takes an AsyncRead and implements Read? The library shouldn't do any other blocking I/O so if you could get that to work, I would expect that to be usable.

fintelia avatar Jan 03 '21 16:01 fintelia

> I can't find any reference online, but is it perhaps possible to implement a wrapper struct that takes an AsyncRead and implements Read?

No. async fn is not composable with regular IO in this way. The best approach for networking would be to download the whole resource before decoding. That might not even be very inefficient unless you want to make heavy use of subsampling/interlacing for partial images. The best approach for avoiding duplicating all decoding methods would be an IO-free approach, where the caller progresses the underlying stream on demand (the Python ecosystem hit similar issues a few years back, so you can read some justification here). That, too, would require some amount of rewrite though.
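To make the IO-free idea concrete, here is a minimal sketch of a push-based decoder. Everything in it is an assumption made for illustration: the `PushDecoder` type, the `Status` and `DecodeError` enums, and the toy format (magic `IMG0`, little-endian width and height, then 8-bit grayscale pixels) are hypothetical and are not part of this crate or of any real image format. The point is only that the decoder never reads from anything itself; the caller hands it bytes as they arrive.

```rust
/// Progress reported by the decoder after each call to `feed`.
#[derive(Debug, PartialEq)]
enum Status {
    /// More input is needed before decoding can finish.
    NeedMore,
    /// Every pixel announced by the header has been received.
    Done,
}

#[derive(Debug, PartialEq)]
enum DecodeError {
    BadMagic,
}

/// Length of the made-up header: 4 magic bytes + width (u32 LE) + height (u32 LE).
const HEADER_LEN: usize = 12;

/// Push-based decoder for a hypothetical toy format. It performs no I/O.
struct PushDecoder {
    header: Vec<u8>,          // header bytes collected so far
    dims: Option<(u32, u32)>, // (width, height) once the header is parsed
    pixels: Vec<u8>,          // grayscale pixels received so far
}

impl PushDecoder {
    fn new() -> Self {
        PushDecoder { header: Vec::new(), dims: None, pixels: Vec::new() }
    }

    /// Consume one chunk of input, however it was obtained.
    fn feed(&mut self, mut chunk: &[u8]) -> Result<Status, DecodeError> {
        if self.dims.is_none() {
            // Top up the header buffer first; chunks may split it anywhere.
            let take = (HEADER_LEN - self.header.len()).min(chunk.len());
            self.header.extend_from_slice(&chunk[..take]);
            chunk = &chunk[take..];
            if self.header.len() < HEADER_LEN {
                return Ok(Status::NeedMore);
            }
            if &self.header[..4] != b"IMG0" {
                return Err(DecodeError::BadMagic);
            }
            let h = &self.header;
            let width = u32::from_le_bytes([h[4], h[5], h[6], h[7]]);
            let height = u32::from_le_bytes([h[8], h[9], h[10], h[11]]);
            self.dims = Some((width, height));
        }
        let (width, height) = self.dims.unwrap();
        let total = width as usize * height as usize;
        // Copy as many pixel bytes as are still missing; ignore any surplus.
        let take = (total - self.pixels.len()).min(chunk.len());
        self.pixels.extend_from_slice(&chunk[..take]);
        if self.pixels.len() == total {
            Ok(Status::Done)
        } else {
            Ok(Status::NeedMore)
        }
    }
}
```

A sync caller could drive `feed` from a loop over `Read::read`, and an async caller from a loop that awaits network chunks; the decoding logic would be shared between the two, which is the duplication this approach is meant to avoid.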

197g avatar Jan 03 '21 16:01 197g

I don't think async makes much sense here at all. Async usually pays off when a resource can be processed as it's streamed in, and none of the underlying libraries support this, so it wouldn't provide all that much benefit.

dbrenot-pelmorex avatar Feb 28 '23 14:02 dbrenot-pelmorex

As a nuance to add, TIFF images are usually quite large (TB+/day) and used in satellite imagery processing. My use case is essentially to read the header and then byte-index into the TIFF, retrieving a 256x256 image tile in the process. To do this, I'll likely end up forking the tiff library and adding support for AsyncRead + AsyncSeek instead of Read + Seek, as it's quite typical to consume imagery from remote S3 stores, and then I can use reqwest or something similar to do fun HTTP range queries. I absolutely understand that the library aims to avoid bloat, and the tiff library is great; honestly, I got a lot further than I expected. But I think more traits could be useful to push the capabilities of the library. I would imagine reading remote streams is a common problem across image formats, but I appreciate that most images can just be downloaded.

tristan-morris avatar Dec 04 '23 21:12 tristan-morris

@tristan-morris I'd be curious to hear why you can't do blocking HTTP requests and then implement the normal Read + Seek traits?
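For reference, a rough sketch of what such an adapter might look like, using only the standard library. The `RangeReader` type and its `fetch` callback are hypothetical: `fetch(offset, len)` stands in for a blocking HTTP range request, and `len` is assumed to be known up front (for example from a Content-Length header). No real HTTP client is used here.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Adapts a blocking "fetch `len` bytes starting at `offset`" function
/// to the standard `Read + Seek` traits.
struct RangeReader<F> {
    fetch: F, // hypothetical stand-in for a blocking HTTP range request
    len: u64, // total size of the remote object, assumed known in advance
    pos: u64, // current read position
}

impl<F> RangeReader<F>
where
    F: FnMut(u64, usize) -> io::Result<Vec<u8>>,
{
    fn new(fetch: F, len: u64) -> Self {
        RangeReader { fetch, len, pos: 0 }
    }
}

impl<F> Read for RangeReader<F>
where
    F: FnMut(u64, usize) -> io::Result<Vec<u8>>,
{
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Never request bytes past the end of the object.
        let remaining = self.len.saturating_sub(self.pos);
        let want = (buf.len() as u64).min(remaining) as usize;
        if want == 0 {
            return Ok(0);
        }
        let data = (self.fetch)(self.pos, want)?;
        // Tolerate a fetch that returns fewer (or more) bytes than requested.
        let n = data.len().min(want);
        buf[..n].copy_from_slice(&data[..n]);
        self.pos += n as u64;
        Ok(n)
    }
}

impl<F> Seek for RangeReader<F> {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        // Seeking is pure bookkeeping; no request is issued until the next read.
        let target: i128 = match from {
            SeekFrom::Start(offset) => offset as i128,
            SeekFrom::End(delta) => self.len as i128 + delta as i128,
            SeekFrom::Current(delta) => self.pos as i128 + delta as i128,
        };
        if target < 0 || target > u64::MAX as i128 {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "seek out of range"));
        }
        self.pos = target as u64;
        Ok(self.pos)
    }
}
```

In this sketch every `read` call triggers one fetch, so a real adapter would probably want a buffer or block cache in front of it to avoid many tiny requests. Whether blocking requests are acceptable depends on the surrounding application, which is what the question above is asking.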

fintelia avatar Dec 05 '23 01:12 fintelia