snapcast icon indicating copy to clipboard operation
snapcast copied to clipboard

Snapclient sounds like /dev/urandom just got piped into my speaker

Open kerta1n opened this issue 8 months ago • 35 comments

Describe the bug On occasion, all Snapclients's FIFO pipe breaks and it sounds like static at max volume.

Steps to Reproduce

  1. No real method to predict this behavior, may be memory related, as FIFO pipe is mounted in ram (/tmp/librespot.pipe), unlikely though since memtest86 reported 0 errors

Environment details

  • OS: Armbian (Dietpi)
  • Snapcast version: 0.27.0
  • Installed from a package, self compiled: Docker image mazzolino:snapcast-client
  • NOTE: snapserver is librespot-snapserver on UNRAID 6.12.4 using docker compose plugin, BOTH snapclient and snapserver dockerfiles were updated and built with Snapcast 0.27.0, unlike what djmaze' repo reflects

log from Snapserver:

2023-10-21 22-06-54.841 [Info] (librespot_playback::player) <Not The Boy> (104947 ms) loaded
2023-10-21 22-06-54.842 [Info] (LibrespotStream) metadata: <Not The Boy>
2023-10-21 22-06-56.755 [Info] (librespot_playback::player) Loading <Yessirskiii> with Spotify URI <spotify:track:59JWp4PjZ9TRM8cmtaDYB1>
2023-10-21 22-06-56.987 [Info] (librespot_playback::player) <Yessirskiii> (219340 ms) loaded
2023-10-21 22-06-56.987 [Info] (LibrespotStream) metadata: <Yessirskiii>
2023-10-21 22-07-00.443 [Info] (PcmStream) State changed: Spotify, state: playing => idle
2023-10-21 22-07-00.443 [Info] (Server) onStateChanged (Spotify): idle

log from a snapclient:

2023-10-21 21-59-56.490 [Info] (Stream) pBuffer->full() && (abs(median_) > 2): 2726
2023-10-21 22-00-07.200 [Info] (Stream) pBuffer->full() && (abs(median_) > 2): 3043
2023-10-21 22-00-17.909 [Info] (Stream) pBuffer->full() && (abs(median_) > 2): 2645
2023-10-21 22-00-28.618 [Info] (Stream) pBuffer->full() && (abs(median_) > 2): 2770
2023-10-21 22-00-39.328 [Info] (Stream) pBuffer->full() && (abs(median_) > 2): 2778
2023-10-21 22-00-39.648 [Info] (Stream) Exception: Not enough frames available, requested frames: 969, available: 360
2023-10-21 22-00-39.649 [Info] (Alsa) Failed to get chunk
2023-10-21 22-00-42.827 [Warn] (Alsa) snd_pcm_avail_delay failed: Broken pipe (-32), avail: 969, delay: 2794, using snd_pcm_avail amd snd_pcm_delay.
2023-10-21 22-00-42.830 [Warn] (Alsa) snd_pcm_delay failed: Broken pipe (-32), avail: -32, delay: 2794
2023-10-21 22-00-55.050 [Info] (Stream) pShortBuffer->full() && (abs(shortMedian_) > 5): 5220
2023-10-21 22-01-07.025 [Info] (Stream) pShortBuffer->full() && (abs(shortMedian_) > 5): 5019
2023-10-21 22-01-19.050 [Info] (Stream) pShortBuffer->full() && (abs(shortMedian_) > 5): 5001

kerta1n avatar Oct 21 '23 22:10 kerta1n

I think the audio output from librespot is taken from stdout and it seems that librespot outputs other data as well to stdout, like a log line, which would cause exactly what you have.

badaix avatar Oct 23 '23 13:10 badaix

Sounds good. Is there any way to change librespot to output to a Pipe instead of STDOUT? And if so, does this mean that I need to find a Librespot container that does so? Or is there any other way I should go about fixing this

kerta1n avatar Oct 23 '23 20:10 kerta1n

Are you running the latest version of librespot. If so, this sounds like a bug to be fixed in librespot and should be reported to them.

kingosticks avatar Oct 23 '23 21:10 kingosticks

Yeah, I think 0.4.1 from my modified image, but that also seems to be the default version in the image from djmaze. I'll gather more info later, on the road for another few hours

kerta1n avatar Oct 23 '23 21:10 kerta1n

Don't know if this will work, but you could try to start librespot in a terminal with the --backend pipe option and pipe the output to strings. If there are debug logs written to stdout, it should show up in your terminal: librespot --backend pipe|strings

badaix avatar Oct 24 '23 13:10 badaix

The headline of this issue describes the problem of my snapcast installation at it's best. But I don't think, it's a snapclient problem. Stop playing music and to start playing noise until I stop playing on spotify occurs on several clients at the same time. After restart playing on spotify, it's playing music again. The problem occurs randomly. It's not predictable.

The attached file shows such an event. While playing music the state changes from 'playing' to 'idle' (timestamp: Oct 31 13:48:49) and back to 'playing' again (timestamp: Oct 31 13:48:50). After changing back to state 'playing' just noise is played until stopping the playing (timestamp: Oct 31 13:49:05).

I'm now using the newest version of snapcast (0.27.0 - debian bookworm testing). But this problem has happened since using 0.24.0).

debugging-sudden-noise.txt

schniefus avatar Oct 31 '23 13:10 schniefus

But I don't think, it's a snapclient problem

Great, then you are on the same page as everyone else. We are saying the problem might be back on the snapserver, in librespot's --backend pipe implementation. If you read the post before yours you'll see an experiment you could try to help narrow down the issue.

kingosticks avatar Oct 31 '23 13:10 kingosticks

Is there any way to change librespot to output to a Pipe instead of STDOUT

You can modify the code in snapcast where librespot is launched and use

--backend pipe --device your_pipe_name_here

kingosticks avatar Oct 31 '23 14:10 kingosticks

Hi,

If I can help : I'm also facing this issue.

  • snapserver v0.27.0
  • librespot: 0.4.2

Very annoying and scary because

  1. It plays garbage (sometimes pure garbage, sometimes garbage over the music still beeing played)
  2. The volume seems to increase at it's maximum, like if snapserver suddenly ignores the currently set volume.

It's quite easy to reproduce, I just have to play spotify (spotify -> librespot -> snapserver -> snapclient -> etc), then change the currently played track, OR move to a different time in the track. Doing this a tens times produces the bug.

When this occurs, I can see the two log lines :

  • snapserver[6030]: State changed: spotify_2, state: playing => idle
  • snapserver[6030]: onStateChanged (spotify_2): idle

This stops when I manually PAUSE the track with spotify, then RUN it again.

As suggested, I tried to change the librespot backend to pipe (&params=--backend%20pipe), and I still see the bug occur.

I also ran librespot alone with backend to pipe, not revelant since librespot detects that there is not audio sink then stops.

cdtx avatar Nov 12 '23 00:11 cdtx

As suggested, I tried to change the librespot backend to pipe (&params=--backend%20pipe), and I still see the bug occur.

My suggestion involved more than that. Your change doesn't do anything.

kingosticks avatar Nov 12 '23 08:11 kingosticks

As suggested, I tried to change the librespot backend to pipe (&params=--backend%20pipe), and I still see the bug occur.

My suggestion involved more than that. Your change doesn't do anything.

Yes that's something I understood after I posted ;-) See my last line

cdtx avatar Nov 12 '23 10:11 cdtx

As suggested, I tried to change the librespot backend to pipe (&params=--backend%20pipe), and I still see the bug occur.

My suggestion involved more than that. Your change doesn't do anything.

Yes that's something I understood after I posted ;-) See my last line

I don't understand. There's still nothing in your post where you did what I suggested.

Running librespot standalone is a good idea. Pipe the audio to VLC, or something, and try to reproduce with your recipe.

Do you have to do it exactly 10 times. Or did you mean 'up to 10 times'? What if you let two tracks play through and then switch between the same two tracks many times I.e. does it have to be different tracks to cause the problem?

kingosticks avatar Nov 12 '23 11:11 kingosticks

I did the test you proposed (btw not easy to find the good vlc option set to read from stdout at correct rate)

librespot -n "Librespot Speaker" --backend pipe | tee /tmp/file | cvlc -vvv --novideo -A alsa  --demux=rawaud  -

I did not see any issue after I ran and ran again my recipe. Sees there is nothing STDOUTed from librespot except the audio.

Could a log be emited by snapserver ?

Do you have to do it exactly 10 times. Or did you mean 'up to 10 times'? What if you let two tracks play through and then switch between the same two tracks many times I.e. does it have to be different tracks to cause the problem?

I meant 'up to 10' yes, sometimes the first jump in one and only one track is enough to produce the bug. I'm quite sure it does not depend on what track is currenty played, or how many tracks have been played before.

cdtx avatar Nov 14 '23 22:11 cdtx

I'm not sure if it's the same problem. But I also had these problems from time to time. What helped me was the following patch. Never had any problems since then:

diff --git a/src/main.rs b/src/main.rs
index 11574d9..f7f4430 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -65,7 +65,7 @@ fn setup_logging(quiet: bool, verbose: bool) {
         }
         Err(_) => {
             if verbose {
-                builder.parse_filters("libmdns=info,librespot=trace");
+                builder.parse_filters("libmdns=info,librespot=info");
             } else if quiet {
                 builder.parse_filters("libmdns=warn,librespot=warn");
             } else {

rage2dev avatar Nov 18 '23 14:11 rage2dev

I'm not sure if it's the same problem. But I also had these problems from time to time. What helped me was the following patch. Never had any problems since then:

diff --git a/src/main.rs b/src/main.rs
index 11574d9..f7f4430 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -65,7 +65,7 @@ fn setup_logging(quiet: bool, verbose: bool) {
         }
         Err(_) => {
             if verbose {
-                builder.parse_filters("libmdns=info,librespot=trace");
+                builder.parse_filters("libmdns=info,librespot=info");
             } else if quiet {
                 builder.parse_filters("libmdns=warn,librespot=warn");
             } else {

It's a patch regarding librespot, I guess. I think reducing the loglevel in the snapserver call to librespot or deactivating the logging of librespot can achieve the same ...

schniefus avatar Nov 18 '23 16:11 schniefus

Since the problem mentioned in the title occures to often, I changed my setup. Snapclient and snapserver were running on the same raspberry pi (3). At the same time a kodi was running on the pi. The pi was connected to the network via WiFi. A few days after my post above I changed the setup. I choose a dedicated pi4 as home for the snapserver, connected to the network over LAN. On the via WiFi connected pi only the snapclient is startetd. But beside the snapclient kodi is still running. After changing the setup, the problem hasn't occured yet.
The solution for the problem in my case seem to be, to run the snapserver and snapclient on different devices and to run the snapserver on a device connected to the network via LAN.

schniefus avatar Nov 18 '23 16:11 schniefus

I'm not sure if it's the same problem. But I also had these problems from time to time. What helped me was the following patch. Never had any problems since then:

diff --git a/src/main.rs b/src/main.rs
index 11574d9..f7f4430 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -65,7 +65,7 @@ fn setup_logging(quiet: bool, verbose: bool) {
         }
         Err(_) => {
             if verbose {
-                builder.parse_filters("libmdns=info,librespot=trace");
+                builder.parse_filters("libmdns=info,librespot=info");
             } else if quiet {
                 builder.parse_filters("libmdns=warn,librespot=warn");
             } else {

It's a patch regarding librespot, I guess. I think reducing the loglevel in the snapserver call to librespot or deactivating the logging of librespot can achieve the same ...

Yes, librespot has a "quiet" logging mode switch, although I don't know if that then breaks the metadata support.

Librespot is configured to log everything to stderr so it seems unlikely that changing the logging verbosity would help but worth a try.

kingosticks avatar Nov 18 '23 16:11 kingosticks

Thanks @rage2dev, however since guys at librespot confirmed (as well as a test I did) that nothing is emited on STDOUT, I doubt that adjusting the loglevel would fix my point.

@schniefus, interesting point you mention : the bug could be related to network performances. (I already have snapserver and snapclient on different pi, but everything uses wifi)

cdtx avatar Nov 18 '23 19:11 cdtx

The title is pure gold.

I have been living with these random blasts of sound for quite some time. They don't occur as often since I usually play background music, but the issue seems to occur sometimes when I play media, skip to the next track or just move through the current track, though as I said, it doesn't happen every time.

I have 3 snapclients (2x rpi wireless, 1x desktop wired) The snapserver is on Home Assistant, which is also wired, so I doubt these issues are related to network performance.

Did anyone else try custom compiling librespot with the trace->info change? Is that the only lead for now?

rwjack avatar Nov 24 '23 16:11 rwjack

I meant 'up to 10' yes, sometimes the first jump in one and only one track is enough to produce the bug. I'm quite sure it does not depend on what track is currenty played, or how many tracks have been played before.

Where I was going with this, is to try and determine if is's related to snapcast trying to use data that librespot hasn't provided yet (because it's still loading it). Normally librespot caches audio data, so the idea was if you let a few songs play through normally you could be sure all the data is cached and readily available. Then do your recipe of skipping between only those same few tracks and/or skipping within those tracks and see if you can trigger the issue. My guess is you won't be able to.

kingosticks avatar Nov 24 '23 16:11 kingosticks

I started to dig the code and TRACE logs. I think I understand we are facing a bug related to the way the LibrespotStream object goes back from pause.

snapserver detects an actual librespot pause when this latter stops emitting samples on it's stdout. (When we switch between tracks or inside a track, and librespot does not have enough samples quick enough, it self goes to pause, then back to running a few ms after)

I think that when librespot goes back to running and starts emiting samples again, something weird happens somewhere in the AsioStream class that make it play garbage.

note: A better connection could avoid librespot going to pause, thus hide the issue.

cdtx avatar Nov 24 '23 17:11 cdtx

Same issue. It seems to hit me on the first track after the system has been idle for an extended period of time like overnight.

Environment details

  • Server OS: Debian 12 Bookworm
  • Snapcast version: 0.26.0+dfsg1-1+b2
  • Installed from a official Debian repo using apt
  • Client OS: Raspbian 10 Buster (was running an old version of Moode and they are behind on upgrades and I haven't switched to raw Armbian)
  • Snapclient 0.15.0-1
  • Installed from official repo using apt

LindyBalboa avatar Dec 03 '23 15:12 LindyBalboa

This same problem has been showing up with varying frequency on both of my setups (for quite some time, with different versions of snapcast and librespot).

I modified snapcast to also tee the stdout from librespot to a file. Importing this file into Audacity shows no problems with the stream after this bug has been triggered.

While this noise is playing it's possible to reduce the volume in Spotify/librespot to around 10%, and the music comes back. Looking at the raw audio file with a hex editor shows that this volume level corresponds to one byte being 00/FF for all (16-bit) samples, while the full range is roughly utilised of the other byte. So I guess this suggests that the problem is with the byte order getting mixed up somehow.

Adding more debug logging to posix_stream.cpp, I noticed that there often are an uneven number of bytes read so far for the current chunk when librespot stops emitting data (for whatever reason – but mostly after skips/seeks with cache disabled). And this is when the problem occurs.

If I add this line to the top of the code block handling the no data from read() case the problem appears to get a lot more difficult to reproduce: len -= len % 2;

If I'm onto something here, similar adjustment might also be needed when resuming from playing silence. (And 4 would maybe be a more correct number if we also have a bug where the stereo channels gets swapped.)

janinge avatar Dec 04 '23 23:12 janinge

diff --git a/playback/src/audio_backend/pipe.rs b/playback/src/audio_backend/pipe.rs
index e0e8a77..0a0d152 100644
--- a/playback/src/audio_backend/pipe.rs
+++ b/playback/src/audio_backend/pipe.rs
@@ -94,12 +94,17 @@ impl Sink for StdoutSink {

 impl SinkAsBytes for StdoutSink {
     fn write_bytes(&mut self, data: &[u8]) -> SinkResult<()> {
-        self.output
+        let out = self.output
             .as_deref_mut()
-            .ok_or(StdoutError::NoOutput)?
-            .write_all(data)
+            .ok_or(StdoutError::NoOutput)?;
+
+
+        out.write_all(data)
             .map_err(StdoutError::OnWrite)?;

+        out.flush()
+            .map_err(StdoutError::FlushFailure)?;
+
         Ok(())
     }
 }

After applying this patch to librespot I have not been able to reproduce this bug. It appears that Stdout in Rust's std::io is line buffered internally, even if stdout itself is set to be unbuffered or fully-buffered. This is obviously not great in our use case, as it at a minimum introduces unnecessary latency and jitter.

I don't know how librespot behaves if asked to use a named pipe instead of outputting to stdout. If it avoids this buffering I guess it can be an option to update snapcast to use pipes instead, to get around this problem also for unpatched versions of librespot.

Snapcast should maybe also be updated to better handle cases where "silence is detected" when the stream is in the middle of a sample, if for no other reason than as a defensive programming measure. Although I'm not sure what would be the best way to go about doing this.

janinge avatar Dec 05 '23 14:12 janinge

After applying this patch to librespot I have not been able to reproduce this bug.

It now reappeared when librespot must have ran out of data in the middle of a song (?). Still haven't been able to trigger it from skips and seeks.

janinge avatar Dec 05 '23 15:12 janinge

So unfortunate but I have the same problem.

rbozan avatar Dec 11 '23 18:12 rbozan

diff --git a/playback/src/audio_backend/pipe.rs b/playback/src/audio_backend/pipe.rs
index e0e8a77..0a0d152 100644
--- a/playback/src/audio_backend/pipe.rs
+++ b/playback/src/audio_backend/pipe.rs
@@ -94,12 +94,17 @@ impl Sink for StdoutSink {

 impl SinkAsBytes for StdoutSink {
     fn write_bytes(&mut self, data: &[u8]) -> SinkResult<()> {
-        self.output
+        let out = self.output
             .as_deref_mut()
-            .ok_or(StdoutError::NoOutput)?
-            .write_all(data)
+            .ok_or(StdoutError::NoOutput)?;
+
+
+        out.write_all(data)
             .map_err(StdoutError::OnWrite)?;

+        out.flush()
+            .map_err(StdoutError::FlushFailure)?;
+
         Ok(())
     }
 }

Confirming this patch has worked for me to stop random noise on track skip

beyondwatts avatar Dec 22 '23 14:12 beyondwatts

Oddly, ever since this issue was opened, I haven't had this occur more than 2-3 times. Whereas it happened regularly during 2023.

On the other hand, completely irrelevant to the topic, I wish we just had a decent replacement for Spotify. All the issues I'm having with home music, Spotify seems to be the culprit behind micro changes in their API, which keep breaking things. Yes, I am aware of MPD and local audio files, but it's the recommendation algorithm I'm talking about...

rwjack avatar Dec 22 '23 14:12 rwjack

diff --git a/playback/src/audio_backend/pipe.rs b/playback/src/audio_backend/pipe.rs
index e0e8a77..0a0d152 100644
--- a/playback/src/audio_backend/pipe.rs
+++ b/playback/src/audio_backend/pipe.rs
@@ -94,12 +94,17 @@ impl Sink for StdoutSink {

 impl SinkAsBytes for StdoutSink {
     fn write_bytes(&mut self, data: &[u8]) -> SinkResult<()> {
-        self.output
+        let out = self.output
             .as_deref_mut()
-            .ok_or(StdoutError::NoOutput)?
-            .write_all(data)
+            .ok_or(StdoutError::NoOutput)?;
+
+
+        out.write_all(data)
             .map_err(StdoutError::OnWrite)?;

+        out.flush()
+            .map_err(StdoutError::FlushFailure)?;
+
         Ok(())
     }
 }

After applying this patch to librespot I have not been able to reproduce this bug. It appears that Stdout in Rust's std::io is line buffered internally, even if stdout itself is set to be unbuffered or fully-buffered. This is obviously not great in our use case, as it at a minimum introduces unnecessary latency and jitter.

I don't know how librespot behaves if asked to use a named pipe instead of outputting to stdout. If it avoids this buffering I guess it can be an option to update snapcast to use pipes instead, to get around this problem also for unpatched versions of librespot.

Snapcast should maybe also be updated to better handle cases where "silence is detected" when the stream is in the middle of a sample, if for no other reason than as a defensive programming measure. Although I'm not sure what would be the best way to go about doing this.

Can you please submit a PR to librespot? I can't think of an application where line buffered output is useful so seems reasonable to fix what appears to be a bug.

kingosticks avatar Dec 22 '23 16:12 kingosticks

Can you please submit a PR to librespot? I can't think of an application where line buffered output is useful so seems reasonable to fix what appears to be a bug.

Yes, I will take a look at it and submit a PR. I assume this isn't a problem when librespot is told to output to a named pipe instead of stdout, but that this write_bytes function is used for both. This new flush call would be redundant in that case, so a little more code might be needed to keep things performant.

I guess snapserver could be made more robust by managing a byte counter for the received stream, and then discard 1-3 bytes if the stream isn't aligned on a sample boundary when starting a new chunk. Is this something that could be wanted? Should I try to write some code also for this, test it and submit a PR?

The latest release of librespot (v0.4.2) was released one and a half years ago. I don't know how far away v0.5.0 is, but I suspect there will still be some time before a release is made where this bug has been patched.

janinge avatar Dec 22 '23 17:12 janinge