TIFF/layered file formats support
This issue is a (very) loose checklist for TIFF image support. Feel free to add/modify/delete/flush down the drain.
- [ ] change our transparency detection logic to use ImageMagick’s saner identify -format '%[opaque]' to be able to properly support layered files (see https://github.com/guardian/grid/issues/2395#issuecomment-939602668)
- [ ] properly support transparent layered TIFFs (see this comment)
- [ ] improve creating prerenders: make them JPEGs for TIFFs without transparency (instead of current png8); leave them png8 only for transparent imagery
- [ ] make thumbnails of images with transparency transparent pngquanted png8
- [ ] [FEATURE] possibly collect extra information without explicit control over these elements (just info about them): layers, paths etc to add to the metadata from the previous point
- [ ] [FEATURE] search by opaque/transparent
- [x] [FEATURE] cleanup and rationalise imaging-related metadata we create [fixed]
- [x] deal correctly with miscategorisation of DNGs as TIFs by ImageMagick (see here, here and, internal ticket, here) [fixed]
- [x] [FEATURE] search by filetype [done]
- [x] [BUG] batch downloads have a wrong extension [fixed]
- [x] since our libraries should themselves deal with different flavours of TIFF correctly, we can support most provided we channel transparent ones through current PNG route and flat ones through current JPEG route
- [x] new MIME type
- [x] adding TIFF directory to metadata
- [x] collecting colourModel information (bit depth, hasAlpha (transparency, really) etc.)
As some of the above operations are currently format-specific, adding TIFF provides a good opportunity to generalise them and bring some order to the galaxy. Viewing with an untrained eye, adding TIFF should be an order of magnitude easier than adding already supported PNG.
For reference, the work done to add PNG support:
- https://github.com/guardian/grid/pull/1848
- https://github.com/guardian/grid/pull/1912 (if we wanted to optimise the tiffs for consumption)
- https://github.com/guardian/grid/pull/1959

Instead of my monumental #1959, I would add:
- https://github.com/guardian/grid/pull/2375 (transparency sniffing leaky, follows https://github.com/guardian/grid/pull/1876), tightened here: https://github.com/guardian/grid/pull/2473
- https://github.com/guardian/grid/pull/1878 (transparency route)
- https://github.com/guardian/grid/pull/1866 (no transparency > JPEG route)
- https://github.com/guardian/grid/pull/1857 (format-specific metadata)
- https://github.com/guardian/grid/pull/2568
> since our libraries should themselves deal with different flavours of TIFF correctly
This was/is a tad optimistic on my part. Some problems to think over/deal with:
- ~GraphicsMagick cannot correctly deal with LAB and CMYK TIFF files. ImageMagick can.~ [done]
- there is a serious problem when dealing with layered TIFFs that contain transparency: if they do not have transparency saved in the composite layer¹, we would need to merge the individual layers ourselves²
¹ most unfortunately, and stupidly, a Photoshop default; also it’s not trivial to tell them apart, but I think there is a way [see next comment for an idea]
² non-simple layers (eg. adjustment layers) cannot be correctly merged using ImageMagick (or whatever really, like eg. libvips). We may be able to tell if the TIFF contains those problematic layers via Exiftool or even metadata-extractor itself, though, and disable support only for them [see here for an explanation of why this is hard, verging on impossible]
[EDIT: for updated logic look waaay down to the last comment] As a lot of the outstanding issues depend on improved transparency detection, here is my current understanding on how to be most robust (will help sort out cases described in the previous comment).
We should use IM’s identify -format '%[opaque]'. It returns a string of boolean(s), eg. TrueFalseTrue, one for every layer in the file: True if the layer is fully opaque, False if it has transparency. We need to count them for all file formats which support layers. I assume the first one always represents a composite “background” layer. For all JPEGs and PNGs and for flat TIFFs, there will be only one (eg. True), we should write it to our hasAlpha (%[opaque]:True means hasAlpha:False).
To correctly sense transparency for layered files, we need to look past that, though. If the count is >1, we could record a new property hasLayers under our (now badly named) fileMetadata/colourModelInformation.
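A minimal Python sketch of the counting step, just to make the idea concrete. It assumes the raw identify output is already in hand as a string; the function names and the firstLayerOpaque field are made up for illustration, only hasLayers comes from the proposal above. Matching is case-insensitive on the assumption that some ImageMagick builds may print lowercase values.

```python
import re
from typing import Dict, List


def parse_opaque_flags(identify_output: str) -> List[bool]:
    """Split a concatenated %[opaque] string such as 'TrueFalseTrue' into booleans.

    One boolean per layer: True means the layer is fully opaque.
    """
    tokens = re.findall(r"true|false", identify_output, flags=re.IGNORECASE)
    return [token.lower() == "true" for token in tokens]


def summarise_layers(identify_output: str) -> Dict[str, bool]:
    """Derive the proposed hasLayers flag plus the opacity of the first layer."""
    flags = parse_opaque_flags(identify_output)
    if not flags:
        raise ValueError("no %[opaque] values found in identify output")
    return {
        "hasLayers": len(flags) > 1,      # more than one value means a layered file
        "firstLayerOpaque": flags[0],     # hypothetical field: opacity of layer [0]
    }
```

For a flat file the first value is the whole answer (hasAlpha is simply the negation of firstLayerOpaque); for layered files it is only the starting point.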
So:
- If there is only one layer – do what we do to PNGs and JPEGs.
- If there is more:
  - If the first layer returns False, treat image as transparent PNGs, use only that first layer (it’s a correct composite). Use it also to create a thumbnail.
  - If the first layer returns True, the proper crazy part comes in:
    - count the number of True for all layers above the first, if the number is ≥ 1, treat image as JPEG (hasAlpha:False), use only first layer (it’s a correct composite). Use it also to create a thumbnail.
    - if the number is 0 we arrived at the problematic case described in the previous comment: we have a layered TIFF file, which should be transparent (hasAlpha:True), but we do not have a composite layer with transparency (stupid Photoshop saving default), only a useless one with baked in white background. We have two options:
      - disallow uploading these files, warning users that they should check Save transparency
      - look in metadata (² above) and attempt merging all the layers but the first one ourselves if we know that there are no “fancy” layers present that would result in the wrong merge result. If there are “fancy” layers – disallow uploading these files (and ask to resave).
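The branching above can be sketched as a small Python function. This is only an illustration of the decision tree, not Grid code: the Route names are invented here, and the input is assumed to be the per-layer %[opaque] values already parsed into booleans (True meaning opaque), first layer first.

```python
from enum import Enum
from typing import List


class Route(Enum):
    FLAT = "flat"                # single layer: handle like today's PNGs and JPEGs
    TRANSPARENT = "transparent"  # composite has transparency: PNG route
    OPAQUE = "opaque"            # composite is a correct opaque image: JPEG route
    AMBIGUOUS = "ambiguous"      # transparency missing from composite: reject or merge


def choose_route(opaque_flags: List[bool]) -> Route:
    """Pick a processing route from per-layer %[opaque] values."""
    if not opaque_flags:
        raise ValueError("expected at least one %[opaque] value")
    if len(opaque_flags) == 1:
        return Route.FLAT
    composite, *layers = opaque_flags
    if not composite:
        # First layer is a correct transparent composite.
        return Route.TRANSPARENT
    if any(layers):
        # At least one opaque layer above the first: composite is trustworthy.
        return Route.OPAQUE
    # Opaque composite but every real layer has transparency (TrueFalseFalse...).
    return Route.AMBIGUOUS
```

Keeping the ambiguous case as its own value means the caller decides between rejecting the upload and attempting a merge, which matches the two options listed.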
Phew! I hope it makes sense. (Does it, though?)
TL;DR we may just disallow uploading pics with fancy layers when transparency is not present in the composite, otherwise, this gets crazier… TBH, Adobe’s fault (case of TrueFalseFalseFalse and as many False and no True ever).
Looking at ² above, there may not be an easy way of making sure the file contains only the simple layers that ImageMagick would be able to merge correctly. Neither Exiftool nor metadata-extractor contains readily useful info here as far as I can see (or rather: Adobe doesn’t provide it). One would need to dump all Photoshop metadata via Exiftool and grep for the presence of any of those tags, then for any of the layer effects and, probably, more.
Here is a part of results for exiftool -u -v1 command on a TIFF file with a Hue/Saturation Adjustment Layer (which ImageMagick would just ignore when merging):
| | + [Layers directory with 2 entries, 4273176 bytes]
| | | LayerCount = 2
| | | + [Layer 1 of 2]
| | | | LayerRectangles = 0 0 1115 1500
| | | | LayerBlendModes = mron
| | | | LayerOpacities = 255
| | | | LayerNames = Layer 0
| | | | LayerUnicodeNames = .Layer 0
| | | | Photoshop_Layers_lnsr = ryal
| | | | LayerIDs = 3
| | | | Photoshop_Layers_clbl = .
| | | | Photoshop_Layers_infx =
| | | | Photoshop_Layers_knko =
| | | | Photoshop_Layers_lspf =
| | | | Photoshop_Layers_lclr =
| | | | LayerModifyDates = .MIB8tsuc4...metadata..layerTimebuod..~..B.A
| | | | Photoshop_Layers_fxrp =
| | | + [Layer 2 of 2]
| | | | LayerRectangles = 0 0 0 0
| | | | LayerBlendModes = mron
| | | | LayerOpacities = 255
| | | | LayerNames = Hue/Saturation 1
| | | | Photoshop_Layers_hue2 =
| | | | Photoshop_Layers_hue2 = ..a..;.Y..-.-KiKi..............;.Y.
| | | | LayerUnicodeNames = .Hue/Saturation 1
| | | | Photoshop_Layers_lnsr = tnoc
| | | | LayerIDs = 4
| | | | Photoshop_Layers_clbl = .
| | | | Photoshop_Layers_infx =
| | | | Photoshop_Layers_knko =
| | | | Photoshop_Layers_lspf =
| | | | Photoshop_Layers_lclr =
| | | | LayerModifyDates = .MIB8tsuc4...metadata..layerTimebuod%.~..B.A
| | | | Photoshop_Layers_fxrp =
Notice the Photoshop_Layers_hue2 entry, indicating the presence of a fancy layer that ImageMagick would just ignore on merge.
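A rough Python sketch of what such a grep could look like, assuming the text of an exiftool -u -v1 dump like the one above is passed in as a string. The function name is made up, and the default list of suspect keys contains only hue2 because that is the only one shown here; a real check would need the full set of adjustment-layer and effect keys, which this sketch does not attempt to enumerate.

```python
import re
from typing import Dict, Iterable, Set

LAYER_HEADER = re.compile(r"\[Layer (\d+) of \d+\]")
LAYER_KEY = re.compile(r"Photoshop_Layers_(\S+)")


def find_fancy_layers(
    verbose_dump: str, suspect_keys: Iterable[str] = ("hue2",)
) -> Dict[int, Set[str]]:
    """Map layer number to the suspect Photoshop_Layers_* keys found in it."""
    suspects = set(suspect_keys)
    found: Dict[int, Set[str]] = {}
    current_layer = None
    for line in verbose_dump.splitlines():
        header = LAYER_HEADER.search(line)
        if header:
            # Lines that follow belong to this layer until the next header.
            current_layer = int(header.group(1))
            continue
        key = LAYER_KEY.search(line)
        if key and current_layer is not None and key.group(1) in suspects:
            found.setdefault(current_layer, set()).add(key.group(1))
    return found
```

An empty result only means none of the listed keys were seen, so it is only as good as the key list it is given.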
Save transparency option is recorded in metadata. Exiftool:
| + [Photoshop Document Data directory, 475128 bytes]
| | Photoshop_DocumentData_MTrn =
| | - Tag 'MTrn' (0 bytes)
Mtrn, Mt16, Mt32
Saving Merged Transparency Key is 'Mtrn', 'Mt16' or 'Mt32'. There is no data associated with these keys.
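Checking for that key in an Exiftool dump could look roughly like this in Python. It is a sketch under assumptions: the dump text is supplied by the caller, the function name is invented, and matching is case-insensitive because the spec wording says Mtrn while the dump above shows MTrn. How Exiftool labels Mt16 and Mt32 is not shown above, so the pattern just looks for the bare key.

```python
import re

# Key must not be glued to other letters or digits; "_MTrn" and "'MTrn'" both match.
MERGED_TRANSPARENCY_KEY = re.compile(
    r"(?<![A-Za-z0-9])(mtrn|mt16|mt32)(?![A-Za-z0-9])", re.IGNORECASE
)


def has_saved_transparency(verbose_dump: str) -> bool:
    """Return True if a merged-transparency key appears in the dump text."""
    return MERGED_TRANSPARENCY_KEY.search(verbose_dump) is not None
```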
Just in case we would want to support PSDs too (should be easy after sorting out the above). Maximise compatibility option ensures a merged result is saved to the first [0] layer (otherwise, this layer consists of gibberish/warning). There is a way to tell if the option has been used via Image resource block 0x0421 | 1057 (exiftool’s HasRealMergedData = 1).
Sadly, this Mtrn visible in Exiftool for a TIFF with Save Transparency option ON isn’t there when I imported com.drew.metadata.photoshop.PhotoshopDirectory. Looks like this directory only contains the Image Resource Ids section, not the needed Layer and Mask Information Section. It looks like we will have to use exiftool :-(
"photoshop": {
"Resolution Info": "72x72 DPI",
"Layers Group Information": "0 0",
"Thumbnail Data": "JpegRGB, 64x64, Decomp 12288 bytes, 1572865 bpp, 1137 bytes",
"Alpha Channels": "12 84 114 97 110 115 112 97 114 101 110 99 121",
"Global Altitude": "30",
"URL List": "0",
"Color Transfer Functions": "[112 values]",
"Print Flags Information": "0 1 0 0 0 0 0 0 0 2",
"Grid and Guides Information": "0 0 0 1 0 0 2 64 0 0 2 64 0 0 0 0",
"Print Scale": "Centered, Scale 1.0",
"Global Angle": "90",
"Seed Number": "5",
"Alpha Identifiers": "0 0 0 0",
"Slices": "RGB 8bpc (0,0,64,64) 1 Slices",
"Print Flags": "0 0 0 0 0 0 0 0 1",
"Print Info 2": "[276 values]",
"Color Halftoning Information": "[72 values]",
"Unicode Alpha Names": "[30 values]",
"Version Info": "1 (Adobe Photoshop, Adobe Photoshop 2021) 1",
"Layer Selection IDs": "0 1 0 0 0 5",
"Caption Digest": "232 241 92 243 47 193 24 161 162 123 103 173 197 100 213 186",
"Background Color": "0 0 255 255 255 255 255 255 0 0",
"Display Info": "[17 values]",
"Layer State Information": "0 0",
"Print Style": "[557 values]",
"Layer Groups Enabled ID": "1",
"Pixel Aspect Ratio": "1.0"
[EDIT] From my huge testing corpus of one file saved three times, it looks like the presence of "Unicode Alpha Names" indicates Save Transparency checkbox was ON. One person on the internet agrees. Seems about as safe a bet as relying on PSD spec… ;-)
[EDIT 2] ~That person and me were wrong. Genuinely opaque TIFFs with alpha channel(s) also contain this.~ (unless some trickster named an alpha channel Transparency manually, this should be OK; if they did, one would need to grep (?) exiftool for MTrn, Mt16 or Mt32 to be really sure) Now off to learn how to make ImageMagick ignore alpha channels and not treat them as tiff:alpha: associated and use the first one to make holes in my image 🤦‍♂️
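For what it’s worth, the "Alpha Channels" value in the dump above decodes to the channel name. A small Python sketch, assuming the value is a run of Pascal strings (one length byte, then that many characters) printed as space-separated decimal bytes, and assuming Latin-1 is good enough for the names; both function names are made up.

```python
from typing import List


def decode_alpha_channel_names(decimal_bytes: str) -> List[str]:
    """Decode space-separated decimal bytes holding consecutive Pascal strings."""
    data = bytes(int(token) for token in decimal_bytes.split())
    names = []
    position = 0
    while position < len(data):
        length = data[position]  # leading length byte
        start = position + 1
        names.append(data[start:start + length].decode("latin-1"))
        position = start + length
    return names


def has_transparency_channel(alpha_channels: str) -> bool:
    """True if any alpha channel is literally named 'Transparency'."""
    return "Transparency" in decode_alpha_channel_names(alpha_channels)
```

Fed the value from the dump ("12 84 114 97 110 115 112 97 114 101 110 99 121"), this yields a single channel named Transparency, which is the name a trickster would have to type by hand to fool the check.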
Loose Alpha channels (which Photoshop should, I think, mark as Unassociated, but instead marks as Unspecified) massively complicate all this. ImageMagick is blindly using the first of those to burn holes in what Photoshop considers opaque pixels… After two days, I came up with this. It needs a lot of testing. This is so boring. I’m not sure Adobe can do file formats…

Because we are only ever using a merged layer ([0]), we can get rid of this code.
That still cannot tell files with no alpha channels and at least one opaque non-bottommost layer apart from those with Save Transparency OFF.
[EDIT: merging respects layer visibility, phew!] ~Just a note to self about further complication with figuring out that case: identify -format '%[opaque]' reports on visible and hidden layers the same. Output to PNG saves hidden out as well. So it’s impossible to merge all but …-0.png to check if there are any genuinely opaque layers (visible) in the file… Exiftool doesn’t seem to report layer visibility either…~
