exiv2 icon indicating copy to clipboard operation
exiv2 copied to clipboard

readMetadata - improve warning & error handling

Open moon6969 opened this issue 4 years ago • 5 comments

I propose readMetadata (or alternative) should:

  • Continue on error: return as much valid data as possible
  • return list of warnings: have a better framework for reporting warnings than stderr

Current implementation of ReadMetadata

(NB: I'm new to the mysteries of metadata and my C++ is quite rusty)
readMetadata is implemented per-image format (eg: in jpgimage.cpp, pngimage.cpp, etc.) In general readMetadata will loop through the image metadata sections and call decoders based on the 3 main metadata types: ExifParser::decode IptcParser::decode XmpParser::decode

Currently it's inconsistent - if errors occur during while reading metadata, the data might be

  • silently lost
  • lost with a stderr warning (#ifndef SUPPRESS_WARNINGS)
  • an exception raised

Once there is an error, often the other valid metadata in that image is not returned.

Continue on error requirement

  • Needs to be handled at multiple "stages" of the reading process:
    • Looping the metadata blobs - eg: failure to read XMP data should not prevent return of Exif data.
    • Looping the groups within a blob - eg: IPTC records, XMP namespaces
    • Looping the individual tags within a group
  • Should be consistent across all image formats and metadata types
  • if the error prevents further reading of that stage, at least return the valid data already gathered

Warnings requirement

  • part of the API rather than build time (#ifndef SUPPRESS_WARNINGS)
  • Not strings to stderr - perhaps an array of warning objects?
  • Warnings should indicate the stage/level etc. so calling app can figure out what was skipped or not EG: An invalid xmp namespace warning implies that namespace is skipped, but other XMP namespaces are OK (unless they also have a warning!)

Reason for proposal

I am writing a tool to re-format some collections of images while preserving as much metadata as possible. I believe following specifications accurately is important, so want to ensure any images I save are compliant, but I was also inspired by this note from https://dev.exiv2.org/issues/1154#note-4 ... My reasoning for being tolerant of data errors is that exiv2 is not a police force with the duty to enforce standards. Exiv2 exists to help users get at their metadata, whether it complies with the specs or not.

moon6969 avatar Nov 23 '20 13:11 moon6969

@moon6969 I admire your efforts to contribute to Exiv2. readMetadata() is the "beating heart" of Exiv2. I don't know that this is a suitable place to apply your "rusty" C++ skills.

None-the-less, your proposal is basically sound.

There is another significant matter related to your proposal. It's been a long standing ambition of the project to have a "unified metadata container". Currently we have three containers for Exif, XMP and IPTC. There is an implementation of the "unified metadata container" in the unstable branch on SVN. If we're going to perform brain surgery on readMetadata(), I'd like to see the "unified metadata container" included in that effort.

Another subject is who is going to undertake this effort? The team of active contributors is small. While a lot of work on 'master' has been done by Dan, Luis and Rosen, I'm not sure there is a plan for what will (and what will not) be in Exiv2 v0.28 when it eventually ships.

I maintain the 0.27-maintenance branch and will not be implementing your proposal in that branch.

I want to see Exiv2 survive after I retire in January 2021. I will finish my book "Image Metadata and Exiv2 Architecture" by the end of 2020. The purpose of the book is to document everything I know about both Metadata and Exiv2 and I hope the effort will be rewarded by seeing proposals such as yours come to life as Exiv2 thrives in years to come.

clanmills avatar Nov 23 '20 14:11 clanmills

I don't know that this is the best of place to apply your "rusty" C++ skills.

Totally agree! I don't have the skill-set or indeed time for such an undertaking, but can contribute to the discussion from my viewpoint as an API consumer.

moon6969 avatar Nov 23 '20 14:11 moon6969

Recruiting contributors to an open-source project is not easy. Many folks make suggestions about what I should do for them. Recruiting Engineers to work on the code, is more difficult.

dt941217dhc0

I discuss contributors in the book: https://clanmills.com/exiv2/book/#11-17

I want to acknowledge the hard work and effort by Andreas, Luis, Dan, Ben, Brad and Neils who have done a lot of great work on Exiv2.

clanmills avatar Nov 23 '20 15:11 clanmills

@moon6969 I hope I haven't insulted you with the Dilbert Cartoon. Good ideas are ... good ideas. Implementation is something else. The challenge of insufficient engineering resources is immense.

When I joined Adobe, their tag line was "Inspiration becomes Reality". In my mind, I changed it to "Inspiration becomes Perspiration!".

clanmills avatar Nov 23 '20 23:11 clanmills

I'm going to assign this to v1.00. One of the top priority matters for v1.00 is the Unified Metadata Container. It has been studied: #1505

Although I'm assigning this to Milestone: v1.00 as commitment to give this thought. We'll have to see what time contributors have available to actually work on this matter.

clanmills avatar Apr 12 '21 09:04 clanmills