Tiff/Exif: Parse RAW images and improve parsing
Current status:
Found full-sized JPEG preview and Nikon Compressed RAW image in SubIFDs for NEF. Added a temporary commit to confirm the existence of the JPEG at the expected position in the file. Still to be determined: how to show those in the output using a method that will work for all Tiff-based files.
Known issues:
Format is not shown in MediaInfo output although it is filled for NEF. Format/Info shows without issues.
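For reference, the position check mentioned under current status can be sketched as below. This is only an illustration, not the code in the temporary commit: the function name `HasJpegSoiAt` is hypothetical, and the only fact relied on is that a JPEG stream starts with the Start Of Image marker 0xFF 0xD8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true when the two bytes at Offset are the JPEG Start Of Image
// marker (0xFF 0xD8). Offset would typically come from an IFD entry; how it
// is obtained is outside this sketch.
bool HasJpegSoiAt(const std::vector<uint8_t>& File, size_t Offset) {
    if (Offset >= File.size() || File.size() - Offset < 2)
        return false; // not enough bytes left for a marker
    return File[Offset] == 0xFF && File[Offset + 1] == 0xD8;
}
```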
Thanks. Keep the commit, I want to see every image in the files.
I'll check for "Format".
Another thing: there can be an 'unlimited' number of SubIFDs, so I just add support for (a randomly chosen number) 3 SubIFDs. So far I have seen files with up to 2. The IFD 'chains' can be 'unlimited' too, and we only handle IFD0 and IFD1. The 'chain' and 'tree' topologies can technically be combined in any way as well.
> I just add support for (a randomly chosen number) 3 SubIFDs.
I am fine with that, no need to manage nonexistent files, we'll see in the long term...
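To make the 'chain' plus 'tree' topology concrete, here is a minimal sketch of a walker that follows both the next-IFD pointers and the SubIFDs tag (0x014A), with a cap instead of a fixed number of slots. It is not how Tiff.cpp works; `CollectIFDs`, `MaxIFDs` and the helpers are hypothetical names, and the sketch assumes a little-endian classic TIFF held fully in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Little-endian reads; return 0 when the read would run past the buffer.
static uint16_t LE16(const Bytes& B, size_t P) {
    if (P > B.size() || B.size() - P < 2) return 0;
    return static_cast<uint16_t>(B[P] | (B[P + 1] << 8));
}
static uint32_t LE32(const Bytes& B, size_t P) {
    if (P > B.size() || B.size() - P < 4) return 0;
    return static_cast<uint32_t>(B[P]) | (static_cast<uint32_t>(B[P + 1]) << 8) |
           (static_cast<uint32_t>(B[P + 2]) << 16) | (static_cast<uint32_t>(B[P + 3]) << 24);
}

// Collect the offsets of every IFD reachable from IFD0, following both the
// "next IFD" chain and SubIFDs, visiting SubIFDs before the next IFD.
// Stops after MaxIFDs entries; already visited offsets are skipped (loops).
std::vector<uint32_t> CollectIFDs(const Bytes& B, size_t MaxIFDs = 16) {
    std::vector<uint32_t> Found;
    if (B.size() < 8 || B[0] != 'I' || B[1] != 'I' || LE16(B, 2) != 42)
        return Found; // this sketch handles little-endian TIFF only
    std::set<uint32_t> Seen;
    std::vector<uint32_t> Pending{LE32(B, 4)}; // stack of offsets to visit
    while (!Pending.empty() && Found.size() < MaxIFDs) {
        const uint32_t Offset = Pending.back();
        Pending.pop_back();
        const size_t Pos = Offset;
        if (Offset == 0 || Pos > B.size() || B.size() - Pos < 6) continue;
        const size_t Count = LE16(B, Pos);           // number of 12-byte entries
        const size_t NextPos = Pos + 2 + 12 * Count; // position of next-IFD field
        if (NextPos > B.size() || B.size() - NextPos < 4) continue;
        if (!Seen.insert(Offset).second) continue;
        Found.push_back(Offset);
        if (uint32_t Next = LE32(B, NextPos)) Pending.push_back(Next);
        std::vector<uint32_t> Subs;
        for (size_t i = 0; i < Count; ++i) {
            const size_t E = Pos + 2 + 12 * i;
            const uint16_t Type = LE16(B, E + 2);
            if (LE16(B, E) != 0x014A || (Type != 4 && Type != 13)) continue;
            const uint32_t N = LE32(B, E + 4);
            if (N == 1) { Subs.push_back(LE32(B, E + 8)); continue; } // inline value
            const size_t Table = LE32(B, E + 8); // offset of the offsets array
            for (uint32_t k = 0; k < N && Subs.size() < MaxIFDs; ++k)
                Subs.push_back(LE32(B, Table + 4 * static_cast<size_t>(k)));
        }
        // Push in reverse so the first SubIFD is popped first.
        Pending.insert(Pending.end(), Subs.rbegin(), Subs.rend());
    }
    return Found;
}
```

With a cap like this, the limit becomes a safety bound against malformed files rather than a count of supported SubIFDs.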
There is an issue with Compression 6 and 7 (JPEG). The names differ between the Tiff and Exif code. The handling may also be wrong for Thumbnail parsing (can JPEG parser handle the strips?). The type 7 one is stored as strips and can be found in Sony ARW. Sony ARW seems complex (where is the actual RAW data?) but the first image is a JPEG thumbnail (type 6) which is not parsed.
Probably need to refactor the Tiff.cpp to loop through main and sub IFDs, detect the compression types and parse/fill accordingly.
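A rough sketch of the "detect the compression type and parse/fill accordingly" step, shared by main IFDs and SubIFDs. The enum, struct and function names are hypothetical and the label strings are illustrative, not the ones used in Tiff.cpp or Exif.cpp. The values themselves are the usual ones: 6 is the old-style JPEG from TIFF 6.0, 7 is the JPEG from TIFF Technical Note 2, and 34713 is the value commonly listed for Nikon NEF Compressed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// What a caller might do with the image data (hypothetical categories).
enum class Handler { RawData, JpegStream, Unknown };

struct CompressionInfo {
    std::string Name; // illustrative label only
    Handler Use;
};

// Map a TIFF Compression (tag 0x0103) value to a label and a handler.
// Only a few values are covered; everything else is reported as Unknown.
CompressionInfo DescribeCompression(uint16_t Value) {
    switch (Value) {
        case 1:     return {"Uncompressed", Handler::RawData};
        case 6:     return {"JPEG (old-style, TIFF 6.0)", Handler::JpegStream};
        case 7:     return {"JPEG (TIFF Technical Note 2)", Handler::JpegStream};
        case 34713: return {"Nikon NEF Compressed", Handler::RawData};
        default:    return {"Unknown", Handler::Unknown};
    }
}
```

Having one table for both code paths would also avoid the Tiff and Exif names drifting apart.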
> The handling may also be wrong for Thumbnail parsing (can JPEG parser handle the strips?).
We don't parse this part so it should not have an impact.
> Sony ARW seems complex (where is the actual RAW data?) but the first image is a JPEG thumbnail (type 6) which is not parsed.
Not parsed? Isn't it just TIFF compatible so displayed as the first image? Sample file?
> Probably need to refactor the Tiff.cpp to loop through main and sub IFDs, detect the compression types and parse/fill accordingly.
If it is too difficult for you, let me know (with a file if it is specific) and I will look at it.
> Not parsed? Isn't it just TIFF compatible so displayed as the first image?
Not passed to JPEG parser so only has this:
Image #1
Type : Thumbnail
Format : JPEG (TIFF v6.0)
Format settings : Little
Density : 350 dpi
> Sample file?
I just randomly got one from dpreview. I read that Nikon has a new RAW format that uses JPEG thumbnail too.
> Not passed to JPEG parser so only has this:
Ha! Doable :).
> I just randomly got one from dpreview
If you have a link, or attach it here :-p.
> If you have a link or attach here :-p.
Sony ARW: https://www.dpreview.com/sample-galleries/5968723450/sony-zv-e10-ii-sample-gallery/9709838505
Nikon NRW: https://www.dpreview.com/sample-galleries/8732294445/nikon-coolpix-p950-sample-gallery/7330085905
Will it be a good idea to handle the SubIFDs in the same way as IFD0 in Tiff.cpp? That way they can be displayed regardless of whether they contain RAW or JPEG, how many of them are present, or whether they are thumbnails or not. Then it'll probably work for most RAW files.
> Will it be a good idea to handle the SubIFDs in the same way as IFD0 in Tiff.cpp?
It may be worth a try.
@cjee21 in practice I rely a lot on your attempts and I adapt the code when you are blocked, so if you think that it is a good option, please try and then I will adapt if needed. What is important to me first is to have a proof of concept; then I adapt it for more sustainability if needed.
> It may be worth a try.
I shifted the SubIFDs enum next to IFD0 for now, maybe we can iterate.
Looks like it works... at least for the Sony and Nikon RAWs that I tested. Now you can view all the images in Nikon NEF. I still don't know where the actual Sony RAW data is. You can add the JPEG parsing if you want; there are a few unparsed JPEG streams.
ARW and NEF have the same issue of not showing the Format while showing Format/Info.
ARW has IFD2 so I added that. There appears to be a crazy amount of private IFDs in ARW. The Makernote offset in ARW also points to just after the Makernote header identifier, so the current code does not detect it.
Added XMP, PS and IPTC parsing, and added a few types and tags. XMP and IPTC are tested, but Photoshop is not tested as I have no Photoshop-generated TIFFs.
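One possible way to cope with the Makernote offset remark above is to look for the header both at the given offset and immediately before it. This is a sketch only: `LocateHeader` and `HeaderPos` are hypothetical names, and the signature to search for is passed in by the caller (for Sony, "SONY DSC " followed by three NUL bytes is a commonly seen header, but treat that as an assumption to verify against real files).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

enum class HeaderPos { AtOffset, BeforeOffset, NotFound };

// Check whether Sig starts exactly at Offset, or ends exactly at Offset
// (i.e. the offset points just past the header identifier).
HeaderPos LocateHeader(const std::vector<uint8_t>& B, size_t Offset, const std::string& Sig) {
    auto Matches = [&](size_t P) {
        return P <= B.size() && B.size() - P >= Sig.size() &&
               std::memcmp(B.data() + P, Sig.data(), Sig.size()) == 0;
    };
    if (Sig.empty()) return HeaderPos::NotFound;
    if (Matches(Offset)) return HeaderPos::AtOffset;
    if (Offset >= Sig.size() && Matches(Offset - Sig.size()))
        return HeaderPos::BeforeOffset;
    return HeaderPos::NotFound;
}
```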
> Sony ARW seems complex (where is the actual RAW data?)
Looks like the large one with type 7 JPEG (ISO) compression is the actual RAW image. It is compressed with Lossless JPEG (1992) (source). Different models use different compression types; there are 3 types in total for Sony.
Tiff/Exif: Update compression values
Tiff/Exif: JPEG tags from specification
Exif: IFD data type
Tiff/Exif: Parse XMP
Tiff/Exif: Parse PSIR
Tiff/Exif: Parse IPTC-NAA
These commits are easy to review and do not affect the parsing of other files much, please send them in a separate PR.
> These commits are easy to review and do not affect the parsing of other files much, please send them in a separate PR.
Sent in 2347 and now this PR is built on that one.
This PR should already be able to display all images in most if not all DNG/Sony/Nikon RAW files. The only thing remaining is if you want to parse the JPEG images using JPEG parser, since some of the JPEG images do not have info in the IFD, so some of the streams are quite empty, without even image dimensions.
Canon RAW files are not handled. Canon has 3 generations of RAW files. One using proprietary format, one Tiff based and the newest is ISOBMFF / MP4 based.
> if you want to parse the JPEG images using JPEG parser
I do, but maybe later.
> Canon RAW files are not handled. Canon has 3 generations of RAW files. One using proprietary format, one Tiff based and the newest is ISOBMFF / MP4 based.
Would be great to have them but not mandatory.
Removed the following from Exif.cpp as it causes Format to disappear and also conflicts with new type handling in Tiff.cpp.
case IFD0::SubfileType: {
    if (Item.second.Read().To_int64u() & 1) {
        Fill(Stream_Image, 0, Image_Type, "Thumbnail");
        Clear(Stream_General, 0, General_Format);
    }
    Parameter = (size_t)-1;
    break;
}
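For context on the `& 1` test in the removed code: assuming `IFD0::SubfileType` is tag 0x00FE (NewSubfileType in TIFF 6.0), bit 0 marks a reduced-resolution version of another image, bit 1 a page of a multi-page image, and bit 2 a transparency mask. A standalone sketch of decoding it without touching the general Format follows. The names are hypothetical, and the "Main" and "Mask" labels are made up for illustration; only "Thumbnail" comes from the code above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

struct SubfileFlags {
    bool ReducedResolution; // bit 0
    bool PageOfMultiPage;   // bit 1
    bool TransparencyMask;  // bit 2
};

// Split the NewSubfileType bit field into its three documented flags.
SubfileFlags DecodeNewSubfileType(uint32_t Value) {
    return {(Value & 1) != 0, (Value & 2) != 0, (Value & 4) != 0};
}

// Hypothetical per-image label derived only from the flags; it does not
// clear or change any file-level field.
std::string ImageTypeLabel(uint32_t Value) {
    const SubfileFlags F = DecodeNewSubfileType(Value);
    if (F.TransparencyMask) return "Mask";
    if (F.ReducedResolution) return "Thumbnail";
    return "Main";
}
```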
Nikon NEF filetype is now detected based on content. Still no idea how to detect ARW and NRW from content.
> Still no idea how to detect ARW and NRW from content.
From https://exiftool.org/forum/index.php?topic=15091.0 it does not seem easy to differentiate NEF and NRW, I'll try to have a look at it later. I see "NRW " in Nikon quality tag, maybe a way to flag that?
> Removed the following from Exif.cpp as it causes Format to disappear and also conflicts with new type handling in Tiff.cpp.
Not sure that this is the right thing to do, I'll check that when I dig deeper into this PR.
> Still no idea how to detect ARW and NRW from content.
NRW: From http://fileformats.archiveteam.org/wiki/Nikon "The NRW format is virtually identical to the NEF format, with a different compression and curves" so maybe checking the corresponding tags.
> I see "NRW " in Nikon quality tag, maybe a way to flag that?
Well, detection using this appears to work (tested on one sample). But I don't know how reliable it is. When the RAW is exported as Tiff, the quality tag is still NRW but with one more space after it.
Updated NRW detection. Should be reliable enough now.
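To illustrate the trailing-space observation (this is not the detection that was committed, whose exact rule is not described here): a sketch contrasting a trimmed comparison, which also matches the exported Tiff, with an exact comparison against "NRW " as seen in the one sample. Both function names are hypothetical, and the exact string is an assumption based only on that sample.

```cpp
#include <cassert>
#include <string>

// Remove trailing spaces and NUL padding from a tag value.
std::string TrimRight(std::string S) {
    while (!S.empty() && (S.back() == ' ' || S.back() == '\0'))
        S.pop_back();
    return S;
}

// Loose check: true for the RAW and for the exported Tiff alike.
bool QualityMentionsNrw(const std::string& Quality) {
    return TrimRight(Quality) == "NRW";
}

// Strict check: true only for the value seen in the RAW sample ("NRW " with
// a single trailing space), so the exported Tiff is not matched.
bool QualityIsRawNrw(const std::string& Quality) {
    return Quality == "NRW ";
}
```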
To avoid leaving this PR open for too long, I did some QA and added a commit handling the issues I found (not always related to this PR, I just noticed them). I also reverted the TIFF "flavors" filling (NRW, NEF...) because it does not always match the extensions of the files I have, so I prefer to keep "TIFF" until we find a better rule.