exiftool "ColorMode" and "ColorSpaceData" are passed through un-normalized, despite being invalid values
exiftool -j may extract "ColorMode" and "ColorSpaceData" values that cannot be written to sys_file_metadata.color_space unaltered.
Example:
exiftool -j "example-sRGB-black-and-white-file.jpg" \
| jq '[.[] | {ColorMode, ColorSpaceData, ColorSpace}]'
[
{
"ColorMode": "Grayscale",
"ColorSpaceData": "GRAY",
"ColorSpace": "sRGB"
}
]
The extractor extension defines the color_space mapping as follows:
jq '.[] | select(.FAL == "color_space")' "extractor/Configuration/Services/ExifTool/default.json"
{
"FAL": "color_space",
"DATA": [
"ColorMode",
"ColorSpaceData",
"ColorSpace->Causal\\Extractor\\Utility\\ColorSpace::normalize"
]
}
Since \Causal\Extractor\Service\Extraction\AbstractExtractionService::remapServiceOutput stops at the first non-null $value, the exiftool value "ColorMode" = "Grayscale" is extracted and passed back to the TYPO3 metadata extraction service, where it is used as a parameter for an INSERT INTO sys_file_metadata.
However, the sys_file_metadata.color_space field is a VARCHAR(4), and the nine-character "Grayscale" does not fit, causing an error in strict mode.
Furthermore, "Grayscale" is an invalid value. According to SYSEXT:filemetadata/Configuration/TCA/Overrides/sys_file_metadata.php, the correct grayscale color_space value would be "grey".
Thus, ColorMode = "Grayscale" and ColorSpaceData = "GRAY" must be normalized to the value "grey".
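The fallback behaviour described above can be modelled in a few lines. This is only an illustrative Python sketch, not the extension's PHP code: the function name remap_first_non_null and the 4-character limit constant are hypothetical, and the "first non-null entry wins" logic is a simplified reading of what remapServiceOutput is described as doing.

```python
# Simplified model (assumption): the first DATA key with a non-null value wins.
COLOR_SPACE_MAX_LENGTH = 4  # sys_file_metadata.color_space is a VARCHAR(4)


def remap_first_non_null(record, data_keys):
    """Return the value of the first key in data_keys that is present and not None."""
    for key in data_keys:
        value = record.get(key)
        if value is not None:
            return value  # later keys are never consulted
    return None


# Values taken from the exiftool output shown in this report.
record = {"ColorMode": "Grayscale", "ColorSpaceData": "GRAY", "ColorSpace": "sRGB"}
value = remap_first_non_null(record, ["ColorMode", "ColorSpaceData", "ColorSpace"])
fits = len(value) <= COLOR_SPACE_MAX_LENGTH

print(value, fits)
```

Under these assumptions, "ColorMode" shadows the normalized "ColorSpace" entry, and the selected value is too long for the column.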
To my mind, this should be handled by the ColorSpace utility, using a configuration like ...:
{
"FAL": "color_space",
"DATA": [
"ColorMode->Causal\\Extractor\\Utility\\ColorSpace::normalize",
"ColorSpaceData->Causal\\Extractor\\Utility\\ColorSpace::normalize",
"ColorSpace->Causal\\Extractor\\Utility\\ColorSpace::normalize"
]
}
... and adjusting Causal\Extractor\Utility\ColorSpace::normalize to match strings that, once lowercased, start with "gray" or "grey", replacing them with the canonical value "grey".
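The proposed matching rule could look roughly like the following. It is a Python sketch of the rule only, not a patch for the PHP utility: normalize_color_space is a hypothetical name, and passing all other values through unchanged is an assumption, since the existing behaviour of ColorSpace::normalize for non-gray values is not shown here.

```python
def normalize_color_space(value):
    """Map any value starting with 'gray' or 'grey' (case-insensitive) to 'grey'."""
    lowered = value.strip().lower()
    if lowered.startswith(("gray", "grey")):
        return "grey"
    # Assumption: everything else is left to the existing normalization logic.
    return value


for raw in ("Grayscale", "GRAY", "greyscale", "sRGB"):
    print(raw, "->", normalize_color_space(raw))
```

With such a rule applied to all three DATA entries, both "Grayscale" and "GRAY" would end up as the 4-character value "grey", which fits the VARCHAR(4) column.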
Sounds legit, mind creating a PR?