Some easy way to access Exif metadata from root IFD and sub IFDs?

Open amyspark opened this issue 4 years ago • 8 comments

Hi!

I'm implementing support for TIFF metadata for Krita and I ran into the following.

Currently, we use Exiv2 only to handle metadata, especially because we can feed the blobs straight to ExifParser. This, however, does not apply to TIFF; libtiff doesn't let you access the available fields wholesale without breaking into its internal state, and Exiv2 (to the best of my knowledge) only parses the first IFD.

Is there any way to tell exiv2 to target only a given IFD/offset, beyond the first one?

amyspark avatar Aug 23 '21 23:08 amyspark

Exiv2 parses all the metadata including the linked list of IFDs.
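
As a rough illustration of what "the linked list of IFDs" means at the byte level, here is a minimal sketch that walks the top-level chain of a classic little-endian TIFF held in memory. It is not Exiv2's parser: the function name `ifdChainOffsets` and the helper functions are made up for this example, and the sketch assumes it can ignore big-endian files, BigTIFF, and SubIFDs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Read a little-endian 16-bit value; returns 0 when out of range (sketch-level handling).
static std::uint16_t readU16(const std::vector<std::uint8_t>& buf, std::size_t pos) {
    if (pos + 2 > buf.size()) return 0;
    return static_cast<std::uint16_t>(buf[pos] | (buf[pos + 1] << 8));
}

// Read a little-endian 32-bit value; returns 0 when out of range.
static std::uint32_t readU32(const std::vector<std::uint8_t>& buf, std::size_t pos) {
    if (pos + 4 > buf.size()) return 0;
    return static_cast<std::uint32_t>(buf[pos]) |
           (static_cast<std::uint32_t>(buf[pos + 1]) << 8) |
           (static_cast<std::uint32_t>(buf[pos + 2]) << 16) |
           (static_cast<std::uint32_t>(buf[pos + 3]) << 24);
}

// Hypothetical helper: offsets of every IFD in the top-level chain.
std::vector<std::uint32_t> ifdChainOffsets(const std::vector<std::uint8_t>& tiff) {
    std::vector<std::uint32_t> offsets;
    // Header: byte order "II", magic number 42, then the offset of the first IFD.
    if (tiff.size() < 8 || tiff[0] != 'I' || tiff[1] != 'I' || readU16(tiff, 2) != 42) {
        return offsets;
    }
    std::set<std::uint32_t> seen;  // guards against cyclic next pointers
    std::uint32_t off = readU32(tiff, 4);
    while (off != 0 && static_cast<std::size_t>(off) + 2 <= tiff.size() &&
           seen.insert(off).second) {
        offsets.push_back(off);
        const std::uint16_t entryCount = readU16(tiff, off);
        // Each directory entry is 12 bytes; the next-IFD pointer follows the entries.
        const std::size_t nextPos = static_cast<std::size_t>(off) + 2 + 12u * entryCount;
        if (nextPos + 4 > tiff.size()) break;
        off = readU32(tiff, nextPos);
    }
    return offsets;
}
```

A multi-IFD file yields one offset per IFD, in chain order; a next pointer of 0 ends the chain.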

clanmills avatar Aug 24 '21 07:08 clanmills

Hi again,

I don't think this is correct. I've just created a 17-layer TIFF file in Krita (file here: 1879.zip), with a custom Make and Model on each IFD. As soon as Exiv2 gets to the fourth IFD, the library logs the following:

Warning: Directory Image3 has an unexpected next pointer; ignored.

and metadata for the fifth layer onwards is lost.

Nevertheless, Exiv2 is still unable to parse the whole metadata from massive TIFFs. Judging from this old Trac bug, Exiv2 will have trouble parsing a file with more than 21 layers in it, because:

https://github.com/Exiv2/exiv2/blob/c9f253f0e5674194045db8f78e53ce7e647b8a69/src/tags_int.cpp#L85-L86

it cannot generate group names for the 22nd IFD onwards (among other holes in the referenced list).

amyspark avatar Aug 25 '21 15:08 amyspark

Please attach your file and it will be investigated.

clanmills avatar Aug 25 '21 17:08 clanmills

I've uploaded it in the comment above. Here's the link again: https://github.com/Exiv2/exiv2/files/7048090/1879.zip

amyspark avatar Aug 25 '21 17:08 amyspark

I'm not sure I understand what this file is. Preview.app on the Mac shows it as an 18-page file with every page identical. Exiv2 doesn't support multi-page TIFF. There's usually a tag (PageNumber 0x0129) in multi-page TIFF. Am I correct in guessing that Krita considers each of those images to be a layer? Is that a feature of Photoshop?

I'm not volunteering to do further work on this because I retired from Exiv2 in June 2021 (after 13 years). I'm happy to help Team Exiv2 when asked; however, I'm not going to get involved in debugging or modifying the code. I've never investigated the ImageN metadata feature. It does seem to be limited to 10, so your assertion that it cannot handle more than 21 layers could be true. Changing this could be difficult if it requires a modification to the architecture.

Curiously today @postscript-dev opened a discussion on the chat server about the SubImageN tags. https://matrix-client.matrix.org/_matrix/media/r0/thumbnail/matrix.org/xOrCGSnvCHcWmVnxSxaZIjqO?width=30&height=30&method=crop and https://github.com/postscript-dev/exiv2/discussions/4

I'm not sure why you've highlighted that code in tags_int.cpp. I believe that code causes the TiffParser state machine to create a CanonCs object when it encounters Canon tag 46 in their maker note. Your TIFF doesn't have a Canon MakerNote.

In discussing the tags with @postscript-dev, I modified a drawing in my book to help him understand group names. The drawing explains how the Nikon Picture Control Tag is revealed as Exiv2.NikonPc.Contrast etc. The state machine uses the following code in tags_int.cpp to create the tagListPc class:

{ nikonPcId,       "Makernote", "NikonPc",      Nikon3MakerNote::tagListPc     },

That object is defined (in nikonmn_int.cpp) as:

    // Nikon3 Picture Control Tag Info
    const TagInfo Nikon3MakerNote::tagInfoPc_[] = {
        TagInfo( 0, "Version", N_("Version"), N_("Version"), nikonPcId, makerTags, undefined, 4, printExifVersion),
        TagInfo( 4, "Name", N_("Name"), N_("Name"), nikonPcId, makerTags, asciiString, 20, printValue),
        TagInfo(24, "Base", N_("Base"), N_("Base"), nikonPcId, makerTags, asciiString, 20, printValue),
        TagInfo(48, "Adjust", N_("Adjust"), N_("Adjust"), nikonPcId, makerTags, unsignedByte, 1, EXV_PRINT_TAG(nikonAdjust)),
        TagInfo(49, "QuickAdjust", N_("Quick Adjust"), N_("Quick adjust"), nikonPcId, makerTags, unsignedByte, 1, printPictureControl),
        TagInfo(50, "Sharpness", N_("Sharpness"), N_("Sharpness"), nikonPcId, makerTags, unsignedByte, 1, printPictureControl),
        TagInfo(51, "Contrast", N_("Contrast"), N_("Contrast"), nikonPcId, makerTags, unsignedByte, 1, printPictureControl),
        TagInfo(52, "Brightness", N_("Brightness"), N_("Brightness"), nikonPcId, makerTags, unsignedByte, 1, printPictureControl),
        TagInfo(53, "Saturation", N_("Saturation"), N_("Saturation"), nikonPcId, makerTags, unsignedByte, 1, printPictureControl),
        TagInfo(54, "HueAdjustment", N_("Hue Adjustment"), N_("Hue adjustment"), nikonPcId, makerTags, unsignedByte, 1, printPictureControl),
        TagInfo(55, "FilterEffect", N_("Filter Effect"), N_("Filter effect"), nikonPcId, makerTags, unsignedByte, 1, EXV_PRINT_TAG(nikonFilterEffect)),
        TagInfo(56, "ToningEffect", N_("Toning Effect"), N_("Toning effect"), nikonPcId, makerTags, unsignedByte, 1, EXV_PRINT_TAG(nikonToningEffect)),
        TagInfo(57, "ToningSaturation", N_("Toning Saturation"), N_("Toning saturation"), nikonPcId, makerTags, unsignedByte, 1, printPictureControl),
        // End of list marker
        TagInfo(0xffff, "(UnknownNikonPcTag)", "(UnknownNikonPcTag)", N_("Unknown Nikon Picture Control Tag"), nikonPcId, makerTags, unsignedByte, 1, printValue)
    };

clanmills avatar Aug 25 '21 22:08 clanmills

Maybe this old post is of some use?

postscript-dev avatar Aug 26 '21 11:08 postscript-dev

Am I correct in guessing that Krita considers each of those images to be a layer? Is that a feature of Photoshop?

Krita considers each individual IFD as a layer, yes. It's not a feature of Photoshop; Adobe uses a single IFD as the composited image, for fallback purposes, and instead stores the layer content as a PSD blob in tag 37724. (Incidentally, I wrote the r/w support for those in Krita a few weeks ago.)

I'm not sure why you've highlighted that code in tags_int.cpp. I believe that code causes the TiffParser state machine to create a CanonCs object when it encounters Canon tag 46 in their maker note. Your TIFF doesn't have a Canon MakerNote.

Correct, but the ExifKey code pairs the groupName with the IFD index:

https://github.com/Exiv2/exiv2/blob/2c57f214c561aaf8173a9fd098b623724f5d8d67/src/tags.cpp#L292-L295

https://github.com/Exiv2/exiv2/blob/d3e311fa624231dac37482eb3bb3bc472751c07c/src/tags_int.cpp#L2543-L2549

As 22 isn't a value stored in groupInfo (because of the hole I mentioned previously), it will return ifdIdNotSet, and that will later throw.
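
To make the failure mode concrete, here is a minimal sketch of a name-to-id lookup with a hole in its table. It is a simplified stand-in, not the Exiv2 source: `GroupId`, `kGroups`, `groupIdFor`, and `makeKey` are hypothetical names, and the table is deliberately tiny. The point is only that a group name missing from the table maps to a "not set" value, and key construction then throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical ids; the real enum in Exiv2 is much larger.
enum class GroupId { notSet, image1, image2, image3 };

struct GroupEntry {
    GroupId id;
    const char* name;
};

// Hypothetical table: only the groups listed here can be resolved by name.
static const GroupEntry kGroups[] = {
    {GroupId::image1, "Image1"},
    {GroupId::image2, "Image2"},
    {GroupId::image3, "Image3"},
};

// Linear search by group name; anything not in the table is "not set".
GroupId groupIdFor(const std::string& groupName) {
    for (const GroupEntry& g : kGroups) {
        if (groupName == g.name) return g.id;
    }
    return GroupId::notSet;
}

// Building a key fails as soon as the group name has no table entry.
std::string makeKey(const std::string& groupName, const std::string& tagName) {
    if (groupIdFor(groupName) == GroupId::notSet) {
        throw std::invalid_argument("unknown group: " + groupName);
    }
    return "Exif." + groupName + "." + tagName;
}
```

With this table, `makeKey("Image2", "Make")` succeeds, while `makeKey("Image22", "Make")` throws because no entry carries that name.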

amyspark avatar Aug 26 '21 13:08 amyspark

@postscript-dev Well done to find that discussion #1124. I also thought about that.

@amyspark I misunderstood your original description. I normally use the term IFD0 and IFD1 to refer to the IFD arrays that are chained together in a single IFD. So when you said Exiv2 does not handle IFD1, I know that's not true because the thumbnail is stored there (see the attached drawing). We are discussing a single file which contains multiple IFDs.

This is an overflow. SubImageN spilled over into canonCsId (or something like that). The SubImageN design isn't correct. A better way is to treat IfdId as a uint16-uint16 union, giving capacity for 65000 subimages.

In src/tags_int.hpp:

    //! Type to specify the IFD to which a metadata belongs
    enum IfdId {
        ifdIdNotSet,
        ifd0Id,
        ifd1Id,
        ifd2Id,
        ifd3Id,
        exifId,
        gpsId,
        iopId,
        mpfId,
        subImage1Id,
        subImage2Id,
        subImage3Id,
        subImage4Id,
        subImage5Id,
        subImage6Id,
        subImage7Id,
        subImage8Id,
        subImage9Id,
        subThumb1Id,
        panaRawId,
        mnId,
        canonId,
        canonAf2Id,
        canonAf3Id,
...

I suspect we can eliminate ifd1Id and ifd2Id, which look a little suspicious if the chain of IFD arrays is longer than 4. It's likely that IfdId should be a 32-bit quantity that is a uint16-uint8-uint8 union, in which the 16-bit quantity (normally zero) is the sequence of the IFD in the file, and the two uint8 values are the sequence in the IFD chain and the state (Ifd, subImage, canonId, etc.).
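
A minimal sketch of that 32-bit layout might look like the following. Nothing here exists in Exiv2: `PackedIfdId`, `packIfdId`, and `unpackIfdId` are hypothetical names, and the field order is one possible reading of the description above. The sketch uses explicit shifts rather than a literal C++ union, on the assumption that avoiding type punning and byte-order dependence is preferable.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decomposition of a 32-bit IfdId.
struct PackedIfdId {
    std::uint16_t fileIfd;     // sequence of the IFD in the file (normally 0)
    std::uint8_t  chainIndex;  // sequence within the IFD chain
    std::uint8_t  kind;        // state: Ifd, subImage, canonId, ...
};

// Pack the three fields into one 32-bit value: [fileIfd:16][chainIndex:8][kind:8].
constexpr std::uint32_t packIfdId(PackedIfdId id) {
    return (static_cast<std::uint32_t>(id.fileIfd) << 16) |
           (static_cast<std::uint32_t>(id.chainIndex) << 8) |
           static_cast<std::uint32_t>(id.kind);
}

// Recover the three fields from a packed 32-bit value.
constexpr PackedIfdId unpackIfdId(std::uint32_t value) {
    return PackedIfdId{static_cast<std::uint16_t>(value >> 16),
                       static_cast<std::uint8_t>((value >> 8) & 0xffu),
                       static_cast<std::uint8_t>(value & 0xffu)};
}
```

With 16 bits for the file-level sequence, the id no longer runs out after a fixed number of enumerators, and a value with `fileIfd == 0` keeps the same low 16 bits as a plain chain-index and kind pair.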

Do you feel up to the challenge of fixing that?

clanmills avatar Aug 26 '21 18:08 clanmills