Unexpected continuous noise when combining NoLACE with DTX
Are NoLACE and DTX supposed to be usable together? I have observed that when turning on NoLACE, as soon as the stream switches to DTX mode, the decoded 0/1-byte packets start generating some noise which sounds like it could be the tail of the last word that was said.
Here's an example of this noise (normalized to make it obvious): opus dtx+nolace.zip
I suppose we could simply mute the output of the decoder if we know we're in DTX mode, but this smells like a bug to me.
Thanks for reporting this @j-schultz. This looks indeed like a bug though it's more likely related to neural PLC (DTX is handled by the PLC module and NoLACE is not active in this case). I tried a few files myself but could not reproduce the issue. Could you share an input file that triggers it? It would also be interesting to know whether the problem is present with dec_complexity = 5 (i.e. neural PLC active and enhancement inactive).
It does happen with both decoding complexity 5 and 7. I'll see if I can get a minimal example put together - as we are streaming live audio with raw opus frames between clients, I'm not sure how comparable this is to using the file-based opus demo.
I also checked whether different encoding parameters could influence the result...
- Both application types, OPUS_APPLICATION_VOIP and OPUS_APPLICATION_AUDIO, expose the issue
- Encoding complexity: tried two different values (5 and 8), no difference
- Does not matter if inband FEC is enabled or not
Apart from that, we force a frame duration of 20 ms, and DTX is obviously enabled.
Thanks @j-schultz. In that case it's indeed rather neural PLC that's causing the issue (looping @jmvalin in). There could be many reasons for this to happen (DTX triggered during active speech, feature prediction going wrong in neural PLC, missing buffer update etc.) so it's crucial to find a file that triggers it.
Apart from this, we should probably revise DTX handling at the decoder in general. Handling it with neural PLC means that we run a relatively expensive neural vocoder to generate silence, which is quite wasteful. I will kick off this discussion in https://www.irccloud.com/irc/libera.chat/channel/opus
What you could try as a temporary fix is to set dec_complexity to 0 during DTX and back to 7 once the first active frame is received. That should solve the noise problem and would also save you some complexity.
Thanks for the suggestion, I applied the temporary workaround and that does seem to do the trick for now.
Actually I might have spoken too soon. While the (incorrect) work of the PLC can no longer be heard with this change, I still get a faint clicking sound every 400 ms, even though the source signal is 100% digital silence. So I think I'll wait for a proper fix before turning on NoLACE.
When in DTX mode, the encoder will send a "refresh" (or keepalive) packet every 400 ms to update the decoder noise estimate. Maybe that's what's causing the issue. Are you also setting dec_complexity to 0 on that one?
For testing I set the complexity to 7 for every successfully received packet and to 0 for any missing packet. So the first packet of the DTX interval still has a complexity of 7. I will change this so that if the packet indicates the start of a DTX phase, it will already reduce the complexity to 0.
Edit: That did the trick.
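For anyone else trying this workaround, here is a minimal sketch of the per-packet decision described above. It is not taken from the Opus sources. The helper name `pick_dec_complexity` and the two constants are hypothetical, and it assumes that a DTX frame shows up either as a missing packet or as a payload of at most 1 byte (the "0/1-byte packets" mentioned at the top of this issue).

```c
#include <stddef.h>

/* Hypothetical constants: the values used for the workaround in this thread. */
enum { DEC_COMPLEXITY_DTX = 0, DEC_COMPLEXITY_ACTIVE = 7 };

/* Hypothetical helper, not part of libopus.
 * Returns the decoder complexity to apply before decoding the next packet.
 * Assumption: a missing packet (NULL payload) or a payload of 0 or 1 bytes
 * means the stream is in DTX; anything longer is treated as an active frame. */
int pick_dec_complexity(const unsigned char *payload, size_t len)
{
    if (payload == NULL || len <= 1) {
        /* DTX or lost packet: keep the neural PLC out of the picture. */
        return DEC_COMPLEXITY_DTX;
    }
    /* Active frame: restore full complexity. */
    return DEC_COMPLEXITY_ACTIVE;
}
```

In a real receive loop, the returned value would be applied with `opus_decoder_ctl(dec, OPUS_SET_COMPLEXITY(c))` right before the `opus_decode()` call for that packet. Note that the 400 ms refresh packets are longer than 1 byte, so a length check alone treats them as active frames; whether that is acceptable depends on the application.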
Is there a file and exact command line I can use to reproduce the problem?
Here's a RAW sample file, together with the decoded result that I receive: sample.zip
Encoding command line: opus_demo.exe -e voip 48000 1 25000 -complexity 8 -dtx -framesize 20 withsilence.raw withsilence.opus
Decoding command line: opus_demo.exe -d 48000 1 -dec_complexity 7 withsilence.opus withsilence.decoded.raw
Opus has been built with the following CMake configuration: cmake -DOPUS_BUILD_PROGRAMS=ON -DOPUS_DEEP_PLC=ON -DOPUS_DRED=ON -DOPUS_OSCE=ON -DOPUS_DNN=ON -DBUILD_SHARED_LIBS=OFF
Same issue here. Any updates?
@jmvalin Hello! Can you help with this issue? We updated to Opus 1.5 in WebRTC and also encountered this problem when enabling DTX. Could you possibly suggest where to look? I tried changing the complexity, but at some point the noise still appears and doesn't stop. please 🙏
So I think I was able to reproduce and it's indeed not what one would expect the decoder to do. Still the signal is very quiet so I'm still curious about how you came to notice the issue and what practical problem it creates.
It is very quiet indeed, but when using Opus in a video conferencing context, you would expect the audio output to remain 100% silent when no one is talking. If small bits of everyone's last spoken word are lingering, it quickly starts feeling strange.
@jmvalin In video conferencing mode on a real iOS device (iPhone), this noise is quite noticeable to the ear :(