[enhancement]: PNG encoding performance
Is there an existing issue for this?
- [X] I have searched the existing issues
Contact Details
No response
What should this feature add?
PNG encoding can contribute substantially to graph execution time, especially when the graph passes large images around. Each time a large image is saved, we have to wait for it to be encoded before it can be written to disk.
Currently, we use PIL to encode PNGs. Invoke's default compression level is 1 - the lowest amount of compression. This is substantially faster than PIL's default compression level of 6, but large images can still take many seconds to encode. Setting the compression level to 0 disables compression, resulting in much faster encodes but much larger files.
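As a point of reference, the effect of the compression level is easy to measure with Pillow alone. A minimal timing sketch (the image size is a placeholder; random noise is roughly a worst case for the encoder):

```python
import io
import time

import numpy as np
from PIL import Image

# Placeholder image: random noise is close to the worst case for a PNG encoder,
# but it makes the cost of higher compression levels obvious.
rng = np.random.default_rng(0)
img = Image.fromarray(rng.integers(0, 256, (4096, 4096, 3), dtype=np.uint8))

for level in (0, 1, 6):
    buf = io.BytesIO()
    start = time.perf_counter()
    # Pillow's PNG writer accepts compress_level (0 = store, 9 = max compression).
    img.save(buf, format="PNG", compress_level=level)
    elapsed = time.perf_counter() - start
    print(f"compress_level={level}: {elapsed:.2f}s, {buf.tell() / 1e6:.1f} MB")
```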
Here are a few ideas to improve the situation:
1. Use a faster PNG encoder. For example, there are Python bindings for fpnge and fpng, two very fast PNG encoders. These packages aren't published to PyPI, but maybe we can install them from GitHub.

   cv2 is marginally faster than PIL. It's not clear whether the gains would offset the additional time needed to constantly convert images from RGB (PIL) to BGR (cv2); see the sketch after this list.

2. Reduce the number of times we encode PNGs by flagging certain node image outputs as only needing to stick around while the graph executes. We could skip saving them to disk and instead cache them in memory. This would require some internal changes, and the UX of workflows may be impacted, since we expect node outputs to be visible in the UI. I think we'd also need to think carefully about the invocation cache.

   Perhaps we would flag certain nodes as saving their outputs, and only those would be written to disk? We are currently kinda dancing around this with the intermediate image pattern.

3. In a similar vein to idea 2, we could encode these ephemeral images with a compression level of 0. The images would be erased when no longer needed (after graph execution?). This way we'd still have physical images, but no compression.
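For idea 1, here is a rough sketch of what the cv2 path looks like, including the RGB to BGR conversion it would add on top of encoding (the image size is a placeholder):

```python
import time

import cv2
import numpy as np

# Placeholder image standing in for a node's RGB output.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, (4096, 4096, 3), dtype=np.uint8)

start = time.perf_counter()
# cv2 expects BGR channel order, so images coming from PIL need this conversion first.
bgr = cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR)
convert_s = time.perf_counter() - start

start = time.perf_counter()
# IMWRITE_PNG_COMPRESSION takes the same 0-9 range as zlib/Pillow.
ok, png = cv2.imencode(".png", bgr, [cv2.IMWRITE_PNG_COMPRESSION, 1])
encode_s = time.perf_counter() - start

print(f"convert: {convert_s:.3f}s, encode: {encode_s:.3f}s, {png.nbytes / 1e6:.1f} MB")
```

Whether the conversion overhead matters in practice depends on how often images would have to cross the PIL/cv2 boundary.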
Alternatives
No response
Additional Content
Ref: #6594
If you're looking for a balanced result, it could be possible to output a PNG with compression level 0 and run the whole file through an lz4 or zstd compressor.
LZ4 and ZSTD are built for speed. LZ4 is meant to be faster than ZSTD, but sacrifices compression ratio (though that may not matter much for pictures).
Ironically, both lz4 and zstd may outperform PNG in both file size and compression speed, since PNG relies on zlib, which is a very old algorithm.
Of course, that means saving/loading becomes a non-standard two-step process, and output files are no longer readable by standard image viewers, which matters to some users (though a utility tool could be provided to decompress the zstd files back into regular PNGs):
Saving
- Save to PNG/BMP/whatever in memory
- Compress to zstd and save image.png.zstd (or image.bmp.zstd) to disk
Loading
- Decompress image.png.zstd into memory
- Load PNG/BMP/whatever from RAM
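A minimal sketch of that two-step flow, assuming Pillow plus the `zstandard` package (the helper names `save_zstd_png`/`load_zstd_png` are made up for illustration):

```python
import io

import zstandard  # pip install zstandard
from PIL import Image


def save_zstd_png(img: Image.Image, path: str, level: int = 3) -> None:
    """Encode as an uncompressed PNG in memory, then zstd-compress the whole file."""
    buf = io.BytesIO()
    img.save(buf, format="PNG", compress_level=0)
    compressed = zstandard.ZstdCompressor(level=level).compress(buf.getvalue())
    with open(path, "wb") as f:
        f.write(compressed)


def load_zstd_png(path: str) -> Image.Image:
    """Reverse the two steps: zstd-decompress to PNG bytes, then decode with Pillow."""
    with open(path, "rb") as f:
        png_bytes = zstandard.ZstdDecompressor().decompress(f.read())
    return Image.open(io.BytesIO(png_bytes))
```

Usage would be e.g. `save_zstd_png(img, "image.png.zstd")` followed later by `load_zstd_png("image.png.zstd")`.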
Btw https://www.lossless-benchmarks.com/ contains a benchmark of all known lossless compression formats. The further left a format is, the faster it is (the axis is time in seconds).
The higher it is, the worse the compression. So you want to pick a format in the bottom-left corner.
qoi seems to trade off some compression ratio, but it is consistently the fastest. jxl looks like it compresses faster than png, but it is considerably slower at decompression.
And unlike my previous suggestion of using non-standard formats, qoi is supported by many viewers.
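For reference, a QOI round trip could look something like the sketch below; this assumes the `qoi` bindings on PyPI, which (as far as I know) expose numpy-array encode/decode helpers, so treat the exact function names as unverified.

```python
import numpy as np
import qoi  # pip install qoi -- assumed bindings for the QOI reference encoder
from PIL import Image

# Placeholder image; in practice this would be a node's PIL output.
rng = np.random.default_rng(0)
img = Image.fromarray(rng.integers(0, 256, (2048, 2048, 3), dtype=np.uint8))

# Encode to QOI bytes and decode back, going through numpy arrays on both sides.
encoded = qoi.encode(np.asarray(img))
decoded = Image.fromarray(qoi.decode(encoded))
```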