lagrange icon indicating copy to clipboard operation
lagrange copied to clipboard

Add support for JPEGXL encoded images

Open shtrophic opened this issue 1 year ago • 10 comments

Hey there. This is a proof-of-concept for JPEGXL decoding of the image/jxl mimetype. I figured that if there is WEBP support, there should be JPEGXL as well. After all, indie web requires indie codecs, amirite?

Right now, this adds an (optional) dependency on libjxl. That is a BSD-3-clause license, so it should be compatible? If you think this is something you'd merge, I can have a look about android (are there build instructions available already?). Can't do iOS though since I don't have apple hardware.

shtrophic avatar Oct 20 '24 21:10 shtrophic

Thanks. This looks reasonable, although I recommend running Clang-Format on your additions using the included .clang-format file in the repository. I can merge this once you're happy with it.

When it comes to mobile, I've been building the image decoder libraries from source and using those in the build, so in theory little is needed to use this on Android and iOS except for some fiddling with the build environment. Of course, it would be nicer if the platform's native APIs were used, since they might be more efficient, but I don't think it's worth the implementation cost. Images are not Gemini's core use case.

skyjake avatar Oct 21 '24 10:10 skyjake

One more thing that might be interesting: since you track isPartial on media files, it would make sense to pass that down to makeTexture_GmImage and then to loadJxl_ since jxl supports progressive decoding. Though, then it would probably be necessary to keep the decoder state somewhere, for it to continue on data that comes in later. Not sure if that would be a sensible thing to do, but the possibility exists. Maybe for another PR.

When it comes to mobile, I've been building the image decoder libraries from source and using those in the build, so in theory little is needed to use this on Android and iOS except for some fiddling with the build environment.

So, will you figure this out then? Available Android build instructions would be nice though!

Of course, it would be nicer if the platform's native APIs were used.

I'm not sure if any android native implementations for jxl even exist though.

shtrophic avatar Oct 23 '24 21:10 shtrophic

Hey there again. I had some time on my hands and gave progressive decoding a shot. You can see it in action when receiving a jxl over a slow internet connection or decoding a rather large image.

If you think that this is too much for now / needs further testing, feel free to tell me; My suggestion for this PR would then be to revert faec6f6 and create another PR for that.

If not, there are a couple of concerns I have. There is now a global map to keep track of the iStatefulJxlDecoders, which is not thread-safe. I wasn't sure if your drawing code is multi threaded? If yes, this probably needs a mutex as well I think. Other than that, I chose the iGmLinkId to be the hashmap's keys. This assumes, that they stay consistent... Maybe there needs to be a more robust one?

For some jxl images, you can try gemini://sir-photch.xyz/jxltest.gmi.

shtrophic avatar Oct 25 '24 20:10 shtrophic

Nice! Progressive decoding sounds useful, although I'll have to review your code when I have some more time available. It's certainly not mandatory to have, though, if the complexity is too high.

I wasn't sure if your drawing code is multi threaded?

Anything related to drawing is done in the main thread. Other threads are used for network requests, subscription updates, and searching for navbar lookup terms, not for anything that touches the UI widgets.

skyjake avatar Oct 28 '24 07:10 skyjake

It's certainly not mandatory to have, though, if the complexity is too high.

Well as I see it there are two ways to do it:

  • Create-destroy decoders with every chunk of the image we receive. This is less complex, but also less efficient, since a JxlDecoder can keep it's progress when it has output an intermediate image, to then continue on any chunks that conclude the original stream
  • Create encoders upon first request, then keep them somewhere until we have fed them with all data, step by step, while we consume intermediate images to then receive the final image and free the decoder

I opted with the latter since it seemed like the proper way to me. The actual usage of the libjxl API does not get more complicated. And for larger images, keeping progress makes sense.

But as I have looked at the_Foundation now, I have probably misinterpreted on how to create a subclass of iMapNode. I'll see what you suggest in your review though before I continue hacking on this :)

shtrophic avatar Oct 28 '24 21:10 shtrophic

@Sir-Photch I reviewed the code and made some corrections. Please see the notes in 1c6a9b00ed0 about the changes. If you keep working on this, I recommend using the "work/jpegxl" branch as the basis.

The major issue that remains fixing here is that the Map containing the decoders is global while link IDs are specific to one GmDocument. I recommend moving the map to be owned by Media (which is owned by GmDocument).

When it comes to including this in prebuilt Lagrange binaries, I am worried about the size of the JXL library. It looks like the binaries would grow by several megabytes, which seems a lot for something that is needed very seldom. I may opt to not include this in the prebuilt binaries. People may of course enable JXL when building from source.

skyjake avatar Dec 14 '24 17:12 skyjake

I recommend moving the map to be owned by Media (which is owned by GmDocument).

Done, though I am not sure on how to use the_Foundation correctly... deinit_,delete_,collect_,clear_? There isn't much documentation on the usage of the Map API there as well.

It's probably not ready to ship yet because there is still some kind of race-condition or bug, but I am not sure whether or not this comes from my changes. You can reproduce by quickly clicking on two inline jxls to trigger two decodings at the same time, collapse them again, and so on. Best case, the images load, mediocre case, it displays Failed to load Image while still decoding in the background (but not displaying intermediate frames), worst case: SIGSEGV.

When it comes to including this in prebuilt Lagrange binaries, I am worried about the size of the JXL library. It looks like the binaries would grow by several megabytes, which seems a lot for something that is needed very seldom.

Makes sense, though, also when statically linking and stripping?

shtrophic avatar Dec 19 '24 18:12 shtrophic

Done, though I am not sure on how to use the_Foundation correctly... deinit_,delete_,collect_,clear_?

Some of the common conventions are explained here, in the repository README: https://git.skyjake.fi/skyjake/the_Foundation/

To elaborate on the ones you mentioned:

  • deinit methods free any resources owned by an object. The object's memory is NOT freed. (cf. C++ destructor method)
  • delete calls deinit and frees the object's memory, too. (cf. C++ delete)
  • collect puts a pointer to the object into the garbage collection queue, to be deleted later.
  • clear empties all elements/contents; the container remains usable for putting more elements in.

The first three methods are automatically generated by macros for a given type, so they are generally available for every declared type.

There isn't much documentation on the usage of the Map API there as well.

True. I would've used Hash here myself. (Hash is also used in many places in the app, while Map is not.) Map exists primarily for sorted keys, and that is not a requirement here. Switching to a Hash shouldn't be too difficult.

The memory model for Hash and Map is the same: the user is responsible for the element ownership, while Hash/Map just maintains a lookup mapping.

some kind of race-condition or bug

Hmm, but you are using separate decoders per link, and also in that test case of two images, isn't all the encoded data already in memory? I'll try to take a look in the debugger. Perhaps it is trying to show the image before decoding has completed, without the higher code levels realizing this.

when statically linking and stripping?

Remains to be seen... I tried compiling libjxl statically but for some reason the build failed (tried on macOS 15.2), lots of library not found for -lcrt0.o errors. There might be system libraries/APIs provided for decoding JPEG XL on macOS nowadays, but I didn't have time to look closer. I would prefer not to add much, if any, platform-dependent code for this, though.

skyjake avatar Dec 21 '24 05:12 skyjake

An important note about async decoding: all the previous decoders in the app are synchronous, so once all the data is available, the image is decoded and drawn on screen. The UI thread is blocked during the decoding (!).

The implicit assumption is that when all the data is available, the image can be immediately displayed on screen.

To properly handle an async decode, you'd have to draw the UI/document with some placeholder while decoding is still happening, and then when decoding is finished, you'd trigger a document relayout so it'll update all the inline images and refresh the UI. One thread-safe way to trigger the layout is postCommand_App("document.layout.changed redo:1"); that posts an SDL event that is then handled in the main thread as soon as possible.

I would be fine with synchronous jxl decoding, if the above starts feeling too cumbersome.

skyjake avatar Dec 21 '24 06:12 skyjake

So I think I have the stability issues ironed out.

The remaining issue is caused by the fact that the drawing code expects the decoding of every image to be finished once any image is finished downloading; In other words, if you click on two inline image links and one downloads and decodes faster than the other, the slower one gets displayed with a caption indicating that loading the image failed. I worked around this in 0c2c014, such that the download progress of any unfinished image is being kept displayed.

It is a little crooked however, since I wasn't sure where & how the drawing offsets are calculated. This is how it currently looks like:

https://github.com/user-attachments/assets/8d48d12c-63cc-49e6-be9b-2e76e3c87a1f

shtrophic avatar Dec 23 '24 16:12 shtrophic

This code has been merged to the dev branch. I've squashed your commits and refactored it a bit so there is a separate "jpegxl.c" where the JXL code lives.

skyjake avatar Aug 28 '25 15:08 skyjake