Pillow icon indicating copy to clipboard operation
Pillow copied to clipboard

Allow disabling default emission of JPEG APP0 and APP14 segments

Open bgilbert opened this issue 1 year ago • 9 comments

When embedding JPEGs into a container file format, it may be desirable to minimize JPEG metadata, since the container will include the pertinent details. By default, libjpeg emits a JFIF APP0 segment for JFIF-compatible colorspaces (grayscale or YCbCr) and Adobe APP14 otherwise. Add a no_default_app_segments option to disable these.

#4639 added code to force emission of the JFIF segment if the DPI is specified, even for JFIF-incompatible colorspaces. This seems inconsistent with the JFIF spec, but apparently other software does it too. no_default_app_segments does not disable this behavior, since it only happens when the application explicitly specifies the DPI.

bgilbert avatar Jan 02 '24 09:01 bgilbert

The setting could also be default_app_segments, defaulting to True.

bgilbert avatar Jan 02 '24 20:01 bgilbert

Recently, #7488 added "restart_marker_blocks" and "restart_marker_rows", and #7553 added "keep_rgb". Those were functional changes, whereas this is about saving a few bytes and grouping two libjpeg settings into one, which feels a bit more arbitrary (I'm afraid that a user will come along and say that they would like to set "write_JFIF_header" to false but not "write_Adobe_marker").

To try and prevent waking up one day and finding an overabundance of settings when saving JPEGs, and being unable to rearrange them without breaking backwards compatibility, can I ask - do you have a planned list of JPEG settings that you're thinking of adding in future PRs?

radarhere avatar Jan 02 '24 22:01 radarhere

This is the last JPEG setting I'm currently planning. I've been submitting them sequentially to avoid merge conflicts in PyImaging_JpegEncoderNew(), partly because I didn't understand how long review would take. (Not complaining. It's best to get API changes correct.)

This change indeed seems a bit more obscure than the other two, and I'm okay skipping it if desired. It's possible to remove the APP segments outside of Pillow by postprocessing the byte stream. (Assuming performance isn't a major concern, which is true for my use case.)

Alternatively, if the setting is too broad, I could split it into two tri-states such as write_jfif_segment and write_adobe_segment so the caller can force either segment on or off.

bgilbert avatar Jan 03 '24 01:01 bgilbert

Would it make sense to have one setting that takes a list of segment names to enable/disable? exclude_app_segments = ('APP0','APP14')

Yay295 avatar Jan 03 '24 02:01 Yay295

I believe JFIF and Adobe are the only APP segments supported natively by libjpeg, so those would be the only two supported values unless Pillow adds its own segments.

bgilbert avatar Jan 03 '24 02:01 bgilbert

Whoops, didn't notice the branch update before force-pushing. I've now rebased onto main. The net diff of the force-pushes is here.

bgilbert avatar Jan 07 '24 03:01 bgilbert

Done.

Before we get too far into detailed review, could I get some maintainer feedback about the high-level approach? Should I continue with no_default_app_segments, invert to default_app_segments, split into two options, drop the PR, something else?

bgilbert avatar Jan 08 '24 11:01 bgilbert

When embedding JPEGs into a container file format

Could you provide an example how to use no_default_app_segments to embed in container and which containers this could be?

homm avatar Jan 09 '24 10:01 homm

Something like:

start = fp.tell()
img.save(fp, format='JPEG', quality='web_medium', streamtype=2, no_default_app_segments=True)
end = fp.tell()

Container metadata, including JPEG tables, can then be written afterward as appropriate. My immediate use case is for a TIFF builder (I need to be able to write non-standards-conforming TIFFs), and DICOM datasets are also on my radar. Any file format designed to embed multiple JPEGs can potentially use this.

bgilbert avatar Jan 22 '24 17:01 bgilbert