liquidsoap
Random characters in place of apostrophe and other symbols in artist/title
Describe the bug: In the artist and title of tracks, random characters often show up if the metadata contains apostrophes or accented characters. For example, media player devices (stream clients) show " & # 8 2 1 7 ; " instead of "'" (apostrophe).
To Reproduce: It seems to happen on random tracks. Editing the artist/title to replace the apostrophe with one typed on the keyboard, and replacing the accented character with the plain letter (A instead of Á), fixes it.
Install method: Opam, latest version.
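For reference, the sequence quoted in the bug description, once its spaces are removed, is an HTML numeric character reference. The short Python sketch below only decodes it to show which character it names. It says nothing about where in the chain (tagger, Liquidsoap, Icecast or client) the escape is introduced, and the track title used is made up for illustration.

```python
import html

# The sequence from the report, with the spaces removed.
escaped = "&#8217;"

# html.unescape resolves numeric character references to the
# code point they name.
decoded = html.unescape(escaped)

print(repr(decoded))        # the decoded character
print(hex(ord(decoded)))    # its Unicode code point

# Hypothetical title, only to show the effect on a whole string.
example_title = html.unescape("Don&#8217;t Stop")
print(example_title)
```

The decoded character is U+2019 (RIGHT SINGLE QUOTATION MARK), the typographic apostrophe, which is not the same character as the ASCII apostrophe typed on a keyboard.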
Hi @KevanGP and thanks for reporting.
Would you be able to share a faulty file to [email protected] ? Thanks
Are you streaming to icecast? Maybe you need to set the mount charset to UTF-8 in your icecast config. I had the same problem a while ago
Basically adding this:
<mount type="default">
<charset>UTF-8</charset>
</mount>
To the icecast.xml config file did the trick.
I came across this issue and the solution above did fix it. However, I would like to suggest that, since liquidsoap manages the mount points dynamically, it should also send the icecast charset field as utf-8 for a given mount point when the liquidsoap icecast encoding field is set to utf-8. Without this, we have utf-8 encoded strings on mount points that are not utf-8 enabled, unless the icecast server config is modified for mounts at a global level as described above.
Thanks for pointing this out. We are in fact sending the charset with our metadata update:
Cry.update_metadata
~charset:(Charset.to_string out_enc)
connection icy_meta
There could still be reasons for this to fail, like an invalid encoding name etc. Do you have more details about the issue so we can try to reproduce? What is the mountpoint configuration and the corresponding liquidsoap code?
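For readers unfamiliar with the mechanism, the metadata update discussed here is an HTTP request to Icecast's admin interface, and the charset travels as a query parameter. Below is a minimal Python sketch of what such a URL could look like. It is not Liquidsoap's or Cry's implementation. The host, mount and title are hypothetical, and the request itself (which needs source or admin credentials) is deliberately not sent.

```python
from urllib.parse import urlencode


def metadata_update_url(base_url, mount, song, charset="UTF-8"):
    """Build an Icecast-style metadata update URL (illustrative sketch).

    The song title is percent-encoded using the bytes of the declared
    charset, so the charset parameter matches what is actually sent.
    """
    query = urlencode(
        {
            "mount": mount,
            "mode": "updinfo",
            "song": song,
            "charset": charset,
        },
        encoding=charset,
    )
    return f"{base_url}/admin/metadata?{query}"


# Hypothetical server, mount and title, for illustration only.
url = metadata_update_url("http://localhost:8000", "/stream", "Don\u2019t Stop")
print(url)
```

In this sketch the typographic apostrophe is sent as the three UTF-8 bytes `%E2%80%99`, and the `charset=UTF-8` parameter declares how those bytes should be read.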
I did notice that the code sends the encoding as part of the metadata update, but I'm thinking of setting the encoding in the mount point configuration, or in this case at dynamic mount point creation (which I don't even know is possible), because I was able to reproduce this again.
If it can't be set by liquidsoap by initiating a connection / creating a mountpoint, then perhaps we can update the documentation so users know they may have to update their icecast configuration separately.
--
If the charset is not set on the mount point, i.e. via the default/global mount in the XML config, I'm not getting the right UTF-8 characters in the metadata output on multiple clients. This is also noted by LibraTime. (1) (2)
This is the charset setting I was looking at in the documentation (but of course changing it to UTF-8 in our case):
<mount type="normal">
<mount-name>/example-complex.ogg</mount-name>
<username>othersource</username>
<password>hackmemore</password>
...
<charset>ISO8859-1</charset>
...
charset
For non-Ogg streams like MP3, the metadata that is inserted into the stream often has no defined character set. We have traditionally assumed UTF8 as it allows for multiple language sets on the web pages and stream directory, however many source clients for MP3 type streams have assumed Latin1 (ISO 8859-1) or leave it to whatever character set is in use on the source client system.
This character mismatch has been known to cause a problem as the stats engine and stream directory servers want UTF8 so now we assume Latin1 for non-Ogg streams (to handle the common case) but you can specify an alternative character set with this option.
The source clients can also specify a charset= parameter to the metadata update URL if they so wish. ...
However, the current solution:
<mount type="default">
<charset>UTF-8</charset>
</mount>
uses this feature / global mount config:
type
The type of the mount point (default: “normal”). A mount of type “default” can be used to specify common values for multiple mountpoints.
I'm testing with this instance of icecast which simply substitutes some values based on environment variables on a vanilla config. If I extract their icecast.xml, add that default encoding config, rebuild and rerun the image, the utf-8 chars are shown as expected.
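One plausible reading of why the global charset setting helps: the Icecast documentation quoted above says Latin1 is assumed for non-Ogg streams, so UTF-8 bytes may be read as Latin1 somewhere along the way. The Python sketch below only illustrates that kind of mismatch. It is not Icecast's or Liquidsoap's actual code, and the helper name is made up.

```python
def misread_as_latin1(text):
    """Encode text as UTF-8, then decode the bytes as if they were Latin1."""
    return text.encode("utf-8").decode("latin-1")


typographic_apostrophe = "\u2019"   # the character behind &#8217;
a_acute = "\u00c1"                  # the accented A from the report

# Non-ASCII characters become several unrelated characters.
print(repr(misread_as_latin1(typographic_apostrophe)))
print(repr(misread_as_latin1(a_acute)))

# ASCII characters are encoded identically in both charsets,
# so they survive the mismatch unchanged.
print(repr(misread_as_latin1("A'")))

# Reversing the two steps recovers the original text.
garbled = misread_as_latin1(a_acute)
print(garbled.encode("latin-1").decode("utf-8") == a_acute)
```

This would also be consistent with the workaround in the original report: a keyboard apostrophe and a plain A are ASCII, so they come through intact whichever of the two charsets is assumed.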
I'm outputting to icecast like this (setting encoding or not makes no difference):
output.icecast(
  id="icecast-test",
  %fdkaac(channels=2, samplerate=wavSampleRate, bitrate=icecastBitrate, afterburner=true, aot=icecastAacProfile, transmux="adts", sbr_mode=true),
  send_icy_metadata=true,
  #encoding="UTF-8",
  host=icecastHost,
  port=icecastPort,
  password=icecastPass,
  mount=icecastMount,
  name="test",
  public=false,
  description="test",
  testSource()
)
Testing with:
- Icecast 2.4.4 / https://github.com/infiniteproject/icecast
- Liquidsoap 2.2.5+git@b2194e147
- Multiple clients - XiiaLive Android, QMMP Linux
I was hoping icecast would do the character conversion when a metadata update is sent in a different encoding from the one the mountpoint is set to.
Anyhow, I'm not sure if there is much we can do:
- We can't know the mountpoint configured character encoding when connecting a source client.
- We can convert metadata updates, but the user has to tell us to do so.
Given these constraints, I don't know whether we can issue a log warning or do anything better.
I'm gonna close this for now, feel free to reopen or follow-up if needed.