Allow disabling default emission of JPEG APP0 and APP14 segments
When embedding JPEGs into a container file format, it may be desirable to minimize JPEG metadata, since the container will include the pertinent details. By default, libjpeg emits a JFIF APP0 segment for JFIF-compatible colorspaces (grayscale or YCbCr) and Adobe APP14 otherwise. Add a no_default_app_segments option to disable these.
#4639 added code to force emission of the JFIF segment if the DPI is specified, even for JFIF-incompatible colorspaces. This seems inconsistent with the JFIF spec, but apparently other software does it too. no_default_app_segments does not disable this behavior, since it only happens when the application explicitly specifies the DPI.
The setting could also be default_app_segments, defaulting to True.
Recently, #7488 added "restart_marker_blocks" and "restart_marker_rows", and #7553 added "keep_rgb". Those were functional changes, whereas this is about saving a few bytes and grouping two libjpeg settings into one, which feels a bit more arbitrary (I'm afraid that a user will come along and say that they would like to set "write_JFIF_header" to false but not "write_Adobe_marker").
To try and prevent waking up one day and finding an overabundance of settings when saving JPEGs, and being unable to rearrange them without breaking backwards compatibility, can I ask - do you have a planned list of JPEG settings that you're thinking of adding in future PRs?
This is the last JPEG setting I'm currently planning. I've been submitting them sequentially to avoid merge conflicts in PyImaging_JpegEncoderNew(), partly because I didn't understand how long review would take. (Not complaining. It's best to get API changes correct.)
This change indeed seems a bit more obscure than the other two, and I'm okay skipping it if desired. It's possible to remove the APP segments outside of Pillow by postprocessing the byte stream. (Assuming performance isn't a major concern, which is true for my use case.)
Alternatively, if the setting is too broad, I could split it into two tri-states such as write_jfif_segment and write_adobe_segment so the caller can force either segment on or off.
Would it make sense to have one setting that takes a list of segment names to enable/disable?
exclude_app_segments = ('APP0','APP14')
I believe JFIF and Adobe are the only APP segments supported natively by libjpeg, so those would be the only two supported values unless Pillow adds its own segments.
Whoops, didn't notice the branch update before force-pushing. I've now rebased onto main. The net diff of the force-pushes is here.
Done.
Before we get too far into detailed review, could I get some maintainer feedback about the high-level approach? Should I continue with no_default_app_segments, invert to default_app_segments, split into two options, drop the PR, something else?
When embedding JPEGs into a container file format
Could you provide an example how to use no_default_app_segments to embed in container and which containers this could be?
Something like:
start = fp.tell()
img.save(fp, format='JPEG', quality='web_medium', streamtype=2, no_default_app_segments=True)
end = fp.tell()
Container metadata, including JPEG tables, can then be written afterward as appropriate. My immediate use case is for a TIFF builder (I need to be able to write non-standards-conforming TIFFs), and DICOM datasets are also on my radar. Any file format designed to embed multiple JPEGs can potentially use this.