Add decoding hooks
This is an attempt at implementing the format decoding hooks registry described in #2368. It works out to be less code than I was expecting.
At the moment the hooks aren't especially useful because the only image formats that hooks can be registered for already have built-in decoders, but that could be changed in followup PRs
We can kick the tires of this API by splitting PCX into a separate crate and using these hooks to wire it up.
Also, jxl-oxide has shipped image integration just days ago, and it would be a good test bed for a more modern format.
@HeroicKatora how would you feel about moving forward with this PR and then spinning out a single image-extras crate that uses hooks to support additional non-default format bindings, like the PCX decoder that was just written?
That sounds like a good way to dogfood the hooks.
IMHO there's also value in segregating the formats with weird licenses. PCX is under WTFPL, we probably don't want to have that as a dependency in the main crate. And all the RAW crates are under LGPL, it would be nice to signpost that as well. There's precedent in GStreamer, where decoders with weird licenses are segregated as gstreamer-plugins-ugly.
Although the PCX author is still around, let me ask them to relicense and see what comes of it.
pcx crate has been relicensed as WTFPL OR MIT OR Apache 2.0, so the license should no longer pose any issues.
The biggest concern being solved with an extra crate is the licensing / static dependency tree situation. However, I can also see users expecting formats to 'just be available' when depending on crates, similar to the discover mechanism of the traditional c libraries. We can't introduce life-before-main in Rust however.
This combination of requirements is not possible to resolve right now when moving to a separate crate, afaik. To automatically register the decoders from image-extra, I think we would need image to depend on image-extra (optionally). However first of all that is cyclic, so some common image-core containing the actual registry is required. Secondly, upstream needs to provide appropriate weak feature selections as appropriate image/image-extra must be explicitly enabled for it to be visible in cfg.
I think the hooks can address the needs for users to substitute some formats, but they do not appropriately solve our internal needs. And in this implementation, the substitution does not work in all situations as intended as the feature selection is always given priority.
On the plus side, splitting out image-extra would also let us break semver on those decoders without impacting image. I doubt that's going to be relevant to PCX though.
But now I think the licensing argument is not as strong as I made it out to be. AFAIU bundling a bunch of formats into -extra isn't really an improvement from the licensing standpoint, and the licensing motivation for PCX is gone regardless.
Addressing the use-case of substituting the builtin decoder with a personal one is quite valid. I'd be glad to approve on the PR and registry design if the registered hooks ran with higher priority than the builtin format dispatch. The trait boxing of Read etc. is neatly wrapped here. I don't think this competes with a design for image-extra. If that needs different priorities then we can write them; and the registration as in this API will be necessary anyhow.
If all you want is to substitute the decoder a known format with a custom implementation you don't need the hook machinery at all.
You can simply disable the support for this format in image and then handle the UnsupportedError with a precise ImageFormat and route it to a custom decoder for that format.
My top-level goal is finding a way to spend less time thinking about, reviewing, and supporting obscure formats.
IMO most of the effort around obscure formats is preparing/preventing them from causing issues for the rest of the crate. Any PR that touches the crate's Cargo.toml or lib.rs might accidentally (or maliciously) break them. Test images bloat the size of the repository. If some dependency yanks their crate, we may have to fork and publish our own copy. Some codec changes like swapping the backend decoder implementation might require a semver break.
Separating out less common formats into a different crate provides a much nicer boundary to be more lax about reviews and concerns like those. If a codec breaks, we can drop support in a new major version. If CI is broken, it doesn't block PRs to the main image crate. If a decoder uses unsafe code, it doesn't prevent the main crate from eventually having #![forbid(unsafe_code)]
I created https://github.com/image-rs/image-extras to test out the usage of this API.
In my view, there isn't much point in having hooks if they're going going to be for overriding the handling of built-in formats. That's because the ImageReader::make_decoder method that actually does the dispatch is quite short and could be replicated by user code. I'm also not aware of other Rust crates that allow overriding behavior for other API users.
Thus the question: Are we OK with merging this and then going ahead with publishing image-extras* to provide first-party support for less common formats?
*We've been saying both image-extra and image-extras in this thread. I'm fine with either, but if you have strong feelings, now is the time to say so!
One of the motivating use cases is Bevy where devs want to load custom formats through the standard asset loader.
https://docs.rs/bevy/latest/bevy/image/struct.ImageLoader.html
Haven't had much time to work on decoding hooks, but wanted to at least update where things stand...
While trying to use this API from image-extras I discovered that the requirement for the ImageFormat to be defined in this crate was really inconvenient. Adding new formats there required PRs across both repositories and waiting for new releases of image so they could be referenced there. Instead we'll probably want externally implemented formats to be registered by file extension and/or their magic bytes.
Ok, I've switched over the hooks to operate on file extensions rather than ImageFormat. There's now format detection hooks which take magic bytes and an optional mask, and map to file extension. I also deprecated the ImageFormat::Pcx type so that image-extras can implement it (built-in types are now never overridable).
I think it's great that hooks can be specified in terms of patterns now. But that is somehow at odds with use cases and evolvability.
That's because the ImageReader::make_decoder method that actually does the dispatch is quite short and could be replicated by user code. […] built-in types are now never overridable.
I really don't see the use case addressed here. If user code is being ran, one could as well construct the decoder by-hand directly. And with that change not even the provided format detection mechanism provides much of a help. Running of your code depends on a feature flag not being unified through third-party sibling dependencies. That seems … odd?
Meanwhile that other use-case is much clearer:
One of the motivating use cases is Bevy where devs want to load custom formats through the standard asset loader.
The standard path is something you can not currently customize; but Bevy wants to customize. In another use case we actually had they wanted to load lossless jpeg, a functionality we had to cut in image-jpeg -> zune_jpeg (well, and for it being buggy). Again this can only be customized if the hooks run before the builtins.
Also consider what happens if an extension crate (not image-extra) defines a hook for a format, and later we adopt that format, or just its detection, into image. Then the logic will switch from Format::Extension to Format::Builtin and the whole decoding changes in a potentially minor version bump. (Same situation if a sibling enables a new format feature flag). And there is no remedy that I see they could take apart from downgrading.
I think in total, hooks must override the default behavior. That's the only consistent way to address both the uses cases and the failure modes. (Of course the internal dispatch still requires a newtype like Format instead of happening on ImageFormat directly, file extension seems like a documentable way of handling that). Since siblings can't just run code, registration of hooks will retain its behavior in upgrade/feature flag scenarios and the same decoders will run. You just sometimes get less errors on previously unrecognized formats if they got built into image/image-extra for free.