forked-daapd
character decoding problem in iTunes Library.xml ("combining diaeresis")
I've got some serious problems with non-ASCII characters in the filenames contained in iTunes Library.xml.
Take the following example:
<key>Track ID</key><integer>123</integer>
<key>Name</key><string>Die Klügsten Männer Der Welt</string>
<key>Artist</key><string>Die Ärzte</string>
<key>Album</key><string>Geräusch - CD2</string>
<key>Location</key><string>file://localhost/Volumes/gemeinsam/Musik/iTunes/iTunes%20Music/Die%20A%CC%88rzte/Gera%CC%88usch%20-%20CD2/07%20Die%20Klu%CC%88gsten%20Ma%CC%88nner%20Der%20Welt.m4a</string>
Special characters like ä, ö, ü are encoded using a "combining diaeresis" (%CC%88), and forked-daapd doesn't seem to honor this. As a result, these tracks are missing from my playlists and only show up when forked-daapd picks the files up from the filesystem.
Whenever %CC%88 is found, the previous character should be converted: a -> ä, o -> ö, u -> ü, A -> Ä, O -> Ö, U -> Ü,
and "%C3%9F" shall become "ß" (without any attention to the previous character).
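The conversion requested above could be sketched as percent-decoding followed by Unicode NFC normalisation, which composes a base letter plus U+0308 (combining diaeresis) into the single precomposed character. This is only an illustration in Python, not forked-daapd code; the function name `decode_location` is hypothetical, and it assumes the Location value is valid percent-encoded UTF-8.

```python
import unicodedata
from urllib.parse import unquote


def decode_location(percent_encoded: str) -> str:
    """Percent-decode a path fragment and compose combining marks (NFC).

    Hypothetical helper for illustration only; assumes UTF-8 input.
    """
    # "%CC%88" decodes to U+0308 (combining diaeresis), "%C3%9F" to "ß".
    decoded = unquote(percent_encoded, encoding="utf-8", errors="strict")
    # NFC merges "A" + U+0308 into the single code point "Ä" (U+00C4).
    return unicodedata.normalize("NFC", decoded)


if __name__ == "__main__":
    print(decode_location("Die%20A%CC%88rzte/Gera%CC%88usch%20-%20CD2"))
```

Normalising the whole string avoids a hand-written a/o/u table and also covers other combining marks, while "%C3%9F" needs no special case because it is already the plain UTF-8 encoding of "ß".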
Thanks!
The evhttp_decode_uri function in use here (provided by libevent) and its GCD counterpart http_decode_uri are fairly simple, and yes, I don't think they support CC88 URL decoding.
However, I never had issues with files generated by iTunes, as they always used a different form of Unicode normalisation in the encoded URL, and those work IIRC.
Is there anything that messes with the encoding of your itunes.xml? Does the source look the same?
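To illustrate what a "different form of Unicode normalisation" means for the encoded URL, here is a small Python sketch (not forked-daapd code; `encoded_forms` is a hypothetical name) that percent-encodes the same name in composed (NFC) and decomposed (NFD) form. The NFD result matches the %CC%88 sequences in the report above.

```python
import unicodedata
from urllib.parse import quote


def encoded_forms(name: str) -> dict:
    """Return the percent-encoded UTF-8 of `name` in NFC and NFD form.

    Hypothetical helper for illustration only.
    """
    return {
        form: quote(unicodedata.normalize(form, name))
        for form in ("NFC", "NFD")
    }


if __name__ == "__main__":
    for form, encoded in encoded_forms("Die \u00c4rzte").items():
        print(form, encoded)
```

Both strings name the same artist, but a byte-wise comparison of the decoded paths treats them as different unless both sides are normalised to the same form first.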
I was looking at the raw XML as generated by iTunes. This may be related to either iTunes on macOS or storing the library on an AFP share... But regardless of the reason, I guess it wouldn't hurt to decode those characters?!