metadata-extractor

Add raw value to Tag

Open FilippoVigani opened this issue 1 year ago • 5 comments

It would make sense to add the raw value to the Tag class, alongside the existing human-readable description, in the same way that the tagType field is technically the raw value behind the human-readable tagName field.

FilippoVigani, Jan 18 '23 16:01

getObject should give you that, no?

https://github.com/drewnoakes/metadata-extractor/blob/5754a0d33659e6b1e9d8f35cf24bc03e0fbaf1b6/Source/com/drew/metadata/Directory.java#L1090-L1101

That API comment isn't great, looking at it again.

drewnoakes, Jan 23 '23 00:01

That is more or less what I was looking for, though I will admit it was hard to find at first. Improving the documentation might be a good first step. One further addition for the raw value would be support for retrieving the raw bytes instead of a class-specific object.

FilippoVigani, Jan 24 '23 13:01

We don't always hold on to the raw bytes for every tag. That would increase memory consumption.

Could you explain your use case?

drewnoakes, Jan 24 '23 13:01

In my case, I would like the raw bytes for a forensic reporting tool: they are necessary for re-parsing from third party tools without including the original file. The report would therefore include both the human-readable values and the raw values.

FilippoVigani, Jan 24 '23 13:01

There isn't always a 1:1 mapping between tag and byte(s). It'd be helpful to see a concrete example, with specific tags.

> necessary for re-parsing from third party tools

Do you mean you need to extract only the metadata, persist it somewhere, then re-parse it later? Depending upon the format, you can do that. For example, take a look at JpegSegmentReader which will give you access to the different JPEG segments. You can then parse them individually at your leisure. What we don't have (and would be hard to add) is a way to map a tag to a specific byte segment.

drewnoakes, Jan 25 '23 01:01
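
To illustrate the segment-level approach suggested in the last comment, here is a minimal sketch that splits a JPEG stream into its header segments so that each payload can be persisted and re-parsed later. It uses only the Java standard library and is not the library's JpegSegmentReader: the class name `JpegSegmentSketch`, the `Segment` type, and `readSegments` are hypothetical names chosen for this example. It assumes that every marker before the start-of-scan marker carries a two-byte length field, which holds for typical files but is not guaranteed for all of them.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not part of metadata-extractor.
public class JpegSegmentSketch {

    /** One JPEG segment: its marker byte and the payload that follows the length field. */
    public static final class Segment {
        public final int marker;
        public final byte[] payload;

        Segment(int marker, byte[] payload) {
            this.marker = marker;
            this.payload = payload;
        }
    }

    /** Reads the segments that precede the image data, stopping at SOS (0xDA) or EOI (0xD9). */
    public static List<Segment> readSegments(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readUnsignedShort() != 0xFFD8) {
            throw new IOException("Not a JPEG stream: missing SOI marker");
        }
        List<Segment> segments = new ArrayList<>();
        while (true) {
            int prefix = data.readUnsignedByte();
            if (prefix != 0xFF) {
                throw new IOException("Expected marker prefix 0xFF, found 0x" + Integer.toHexString(prefix));
            }
            int marker = data.readUnsignedByte();
            while (marker == 0xFF) {
                marker = data.readUnsignedByte(); // skip optional fill bytes
            }
            if (marker == 0xDA || marker == 0xD9) {
                break; // entropy-coded image data (or end of image) follows; no more metadata segments
            }
            // Assumption: every marker reached here is followed by a big-endian length
            // that counts its own two bytes.
            int length = data.readUnsignedShort();
            if (length < 2) {
                throw new IOException("Invalid segment length: " + length);
            }
            byte[] payload = new byte[length - 2];
            data.readFully(payload);
            segments.add(new Segment(marker, payload));
        }
        return segments;
    }

    public static void main(String[] args) throws IOException {
        // Tiny synthetic stream: SOI, one APP1 segment with a 4-byte payload, EOI.
        byte[] jpeg = {
            (byte) 0xFF, (byte) 0xD8,
            (byte) 0xFF, (byte) 0xE1, 0x00, 0x06, 'E', 'x', 'i', 'f',
            (byte) 0xFF, (byte) 0xD9
        };
        for (Segment s : readSegments(new ByteArrayInputStream(jpeg))) {
            System.out.printf("marker=0x%02X, %d payload bytes%n", s.marker, s.payload.length);
        }
    }
}
```

Each `payload` array could be stored in a report next to the human-readable values. Note that this works at the granularity of whole segments, not individual tags, which matches the limitation described above.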