pigallery2
Special characters displayed wrongly
Describe the bug
The Danish letters Ææ, Øø and Åå and the German ü (and probably other non-ASCII characters) are not displayed correctly in the user interface when saved as "Region Person Display Name" or "Region Name"; I am not sure which of the two is actually read. The metadata was added by DigiKam 8.1.0 and, as far as I can gather, it is stored as UTF-8.
See screenshot and attached photo (borrowed from wikimedia)
exiftool output:
$ exiftool -codedcharacterset 2023-08-27-120000-example.jpg
Coded Character Set : UTF8
$ exiftool -Region* -Keywords -XP* 2023-08-27-120000-example.jpg
Region Person Display Name : Person Carrying ChildInYellowDress, Pærsøn Åkessün Æñtestå, RedHairedPerson SittingOnBench
Region Rectangle : 0.611242, 0.453747, 0.0214017, 0.0391972, 0.226011, 0.500157, 0.019285, 0.0319849, 0.365945, 0.486359, 0.00470367, 0.00940734
Region Applied To Dimensions H : 3189
Region Applied To Dimensions Unit: pixel
Region Applied To Dimensions W : 4252
Region Area H : 0.0391972, 0.0319849, 0.00940734
Region Area Unit : normalized, normalized, normalized
Region Area W : 0.0214017, 0.019285, 0.00470367
Region Area X : 0.621943, 0.235654, 0.368297
Region Area Y : 0.473346, 0.516149, 0.491063
Region Name : Person Carrying ChildInYellowDress, Pærsøn Åkessün Æñtestå, RedHairedPerson SittingOnBench
Region Type : Face, Face, Face
Keywords : Holiday, RedHairedPerson SittingOnBench, Person Carrying ChildInYellowDress, Pærsøn Åkessün Æñtestå
XP Keywords : Holiday;RedHairedPerson SittingOnBench;Person Carrying ChildInYellowDress;Pærsøn Åkessün Æñtestå
Photo/video (optional) that causes the bug
Screenshot
Note how the Keywords or XP Keywords are displayed correctly
Used app version:
- docker:latest
Did some further testing. I saved more person metadata to the file using "Tag That Photo". With that, PiGallery displays the names correctly.
Once the data is rewritten by exiftool, PiGallery displays it wrongly. This happens with both the exiftool Windows executable and the Ubuntu version under WSL.
WSL (ubuntu)
$ cp 2023-08-27-120000-example-ttp.jpg 2023-08-27-120000-example-ttp-exifcopy.jpg
$ exiftool -all= -tagsfromfile @ -all:all -IPTC:All -XMP:All -ColorSpaceTags -F -codedcharacterset=utf8 2023-08-27-120000-example-ttp-exifcopy.jpg
cmd.exe (windows 10)
>copy 2023-08-27-120000-example-ttp.jpg 2023-08-27-120000-example-ttp-exifwincopy.jpg
>exiftool -all= -tagsfromfile @ -all:all -IPTC:All -XMP:All -ColorSpaceTags -F -codedcharacterset=utf8 2023-08-27-120000-example-ttp-exifwincopy.jpg
When sorting and comparing the exif data as displayed by exiftool, there are no differences.
This is confusing, because I think "Tag That Photo" uses exiftool under the hood
I had the chance to play around a bit.
Converting the variable "name" in line 487 of MetadataLoader.ts from ASCII to UTF-8 at least seems to fix the problem when viewed in the log. Without this conversion, the same wrong characters show up in the log as in the UI: https://github.com/bpatrik/pigallery2/blob/3489f1d55ad4b7a5e83149887c665f7a5beddef0/src/backend/model/fileaccess/MetadataLoader.ts#L487C18-L487C18
Logger.info(LOG_TAG, 'name: ' + name);
Logger.info(LOG_TAG, 'name converted from ascii to utf-8: ' + Buffer.from(name, 'ascii').toString('utf-8'));
Logger.info(LOG_TAG, 'name converted from ascii to utf-8 twice: ' + Buffer.from(Buffer.from(name, 'ascii').toString('utf-8'), 'ascii').toString('utf-8'));
the output is:
So it could be that the library that reads the metadata assumes the strings are ASCII-encoded, which is why the conversion works. According to https://exiftool.org/TagNames/MWG.html, the EXIF specification defines these strings as ASCII, but the MWG recommends storing them as UTF-8, and exiftool honours that recommendation when its MWG module is loaded. This mismatch may be the cause of the assumed ASCII format.
Contrary to the EXIF specification, the MWG recommends that EXIF "ASCII" string values be stored as UTF-8. To honour this, the exiftool application sets the default internal EXIF string encoding to "UTF8" when the MWG module is loaded, but via the API this must be done manually by setting the CharsetEXIF option.
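The suspected mechanism can be reproduced in isolation. This is only a minimal sketch, assuming the metadata reader decodes the UTF-8 bytes one byte per character; the helper names `misreadUtf8AsLatin1` and `repairMojibake` are hypothetical and not part of pigallery2. It uses 'latin1' explicitly because, according to the Node.js documentation, encoding a string into a Buffer with 'ascii' behaves the same as 'latin1', which would explain why the 'ascii' conversion in the log statements above has an effect at all.

```typescript
// Sample name taken from the exiftool output in this issue.
const original = 'Pærsøn Åkessün Æñtestå';

// Hypothetical helper: simulate a reader that treats every UTF-8 byte
// as one single-byte (Latin-1) character, producing mojibake.
function misreadUtf8AsLatin1(text: string): string {
  return Buffer.from(text, 'utf-8').toString('latin1');
}

// Hypothetical helper: turn each character back into one byte,
// then decode the resulting bytes as UTF-8.
function repairMojibake(garbled: string): string {
  return Buffer.from(garbled, 'latin1').toString('utf-8');
}

const garbled = misreadUtf8AsLatin1(original);
const repaired = repairMojibake(garbled);

console.log('garbled:  ' + garbled);
console.log('repaired: ' + repaired);
console.log('round trip ok: ' + (repaired === original));
```

Applying the repair to a string that was already decoded correctly would damage it, so a real fix probably belongs where the bytes are first decoded rather than as a conversion afterwards.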
I'm not yet comfortable enough with the code to suggest a solution and create a pull request with a correction, but I wanted to share my findings.
Fixed with https://github.com/bpatrik/pigallery2/pull/826