DeepFilterNet

Refactor noise processing

Open danielhuang opened this issue 1 year ago • 4 comments

The current implementation, which drops samples and tracks a delay, tends to accumulate a delay that is not reduced even after CPU load drops and processing can keep up again. When this happens, it also causes unrelated processing (e.g. EasyEffects output processing) to stutter.

Instead of keeping track of a delay, the processing thread will read samples from a ring buffer, and samples only get dropped when the ring buffer gets full.
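As a rough illustration of the drop-only-when-full idea, here is a minimal single-threaded sketch. `BoundedQueue` and its methods are hypothetical names made up for this example, not types from the plugin, and a real implementation would use a lock-free ring buffer shared between the host thread and the processing thread.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-capacity sample queue (illustration only, not the plugin's type).
pub struct BoundedQueue {
    buf: VecDeque<f32>,
    capacity: usize,
    dropped: usize,
}

impl BoundedQueue {
    pub fn new(capacity: usize) -> Self {
        Self { buf: VecDeque::with_capacity(capacity), capacity, dropped: 0 }
    }

    /// Push as many samples as fit; the rest are dropped and counted.
    /// Returns the number of samples accepted.
    pub fn push_slice(&mut self, samples: &[f32]) -> usize {
        let free = self.capacity - self.buf.len();
        let accepted = free.min(samples.len());
        self.buf.extend(samples[..accepted].iter().copied());
        self.dropped += samples.len() - accepted;
        accepted
    }

    /// Pop up to `out.len()` samples into `out`; returns the number written.
    pub fn pop_slice(&mut self, out: &mut [f32]) -> usize {
        let n = out.len().min(self.buf.len());
        for (dst, src) in out[..n].iter_mut().zip(self.buf.drain(..n)) {
            *dst = src;
        }
        n
    }

    /// Total number of samples dropped because the queue was full.
    pub fn dropped(&self) -> usize {
        self.dropped
    }

    /// Number of samples currently queued.
    pub fn len(&self) -> usize {
        self.buf.len()
    }
}
```

Because no delay counter is kept, the backlog is simply whatever is in the queue: once the consumer catches up, the queue drains and the extra latency disappears.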

Also fixes a resource leak: the background thread would continue to poll for samples.

danielhuang avatar Nov 02 '24 23:11 danielhuang

Mostly done; the calling thread for the plugin no longer blocks (EasyEffects expects the plugin run method to not take too long, otherwise other audio streams could stutter), and CPU load will no longer lead to audio stutters. Since there are no more spin loops, the processing thread will remain idle when there's no activity. Latency is also lower when there's no other CPU activity.
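To make the non-blocking, no-spin-loop behaviour concrete, here is a minimal sketch using only the standard library. `Worker`, `submit`, `poll`, `stop` and `halve` are hypothetical names for this example and are not taken from the plugin; the bounded `sync_channel` stands in for whatever queue the real code uses.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread::{self, JoinHandle};

/// Stand-in for the denoiser (assumption): halves every sample.
pub fn halve(frame: &mut [f32]) {
    for s in frame.iter_mut() {
        *s *= 0.5;
    }
}

/// Hypothetical worker wrapper (illustration only).
pub struct Worker {
    to_worker: Option<SyncSender<Vec<f32>>>,
    from_worker: Receiver<Vec<f32>>,
    handle: Option<JoinHandle<()>>,
}

impl Worker {
    pub fn spawn(queue_frames: usize, process: fn(&mut [f32])) -> Self {
        let (to_worker, rx_in) = sync_channel::<Vec<f32>>(queue_frames);
        let (tx_out, from_worker) = sync_channel::<Vec<f32>>(queue_frames);
        let handle = thread::spawn(move || {
            // recv() parks the thread until a frame arrives, so there is no spin loop.
            // It returns Err once the sender is dropped, which ends the thread.
            while let Ok(mut frame) = rx_in.recv() {
                process(&mut frame);
                // If the output queue is full or closed, the processed frame is dropped.
                let _ = tx_out.try_send(frame);
            }
        });
        Self { to_worker: Some(to_worker), from_worker, handle: Some(handle) }
    }

    /// Host side: hand a frame to the worker without blocking.
    /// Returns false if the queue is full (frame dropped) or the worker is stopped.
    /// Note: to_vec() allocates, which a real-time implementation would avoid.
    pub fn submit(&self, input: &[f32]) -> bool {
        self.to_worker
            .as_ref()
            .map_or(false, |tx| tx.try_send(input.to_vec()).is_ok())
    }

    /// Host side: fetch a processed frame without blocking; writes silence if none is ready.
    pub fn poll(&self, output: &mut [f32]) -> bool {
        match self.from_worker.try_recv() {
            Ok(frame) if frame.len() == output.len() => {
                output.copy_from_slice(&frame);
                true
            }
            _ => {
                output.fill(0.0);
                false
            }
        }
    }

    /// Drop the sender so the worker's recv() fails, then join the thread.
    pub fn stop(&mut self) {
        self.to_worker.take();
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        // Joining on drop ensures the background thread does not outlive the plugin instance.
        self.stop();
    }
}
```

The host would call `submit` and `poll` once per run callback; neither call waits on the worker, so a slow worker shows up as dropped frames and silence rather than a stalled audio graph.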

Code should be ready to use, just need some more testing.

danielhuang avatar Nov 10 '24 08:11 danielhuang

Does not work with

ffmpeg -i in.wav -af ladspa=file=libdeep_filter_ladspa:plugin=deep_filter_mono:sample_rate=48000:controls="40|-15|35|35|4|0.02" out.wav

It produces only zero-value samples. The Rikorose branch is okay.

2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsifying node #49 "/df_convp/df_convp.4/Relu.low" Max
2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsified node #49 "/df_convp/df_convp.4/Relu.low" Max with PulsingWrappingOp
2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsifying node #50 "/Add_1" Add
2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsified node #50 "/Add_1" Add with PulsingWrappingOp
2025-01-25T13:36:08.230Z | INFO |  df::tract | Init DF decoder
2025-01-25T13:36:08.251Z | INFO |  df::tract | Running with model type deepfilternet3 lookahead 0
2025-01-25T13:36:08.252Z | INFO |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | Initialized plugin in 384.5ms
[ladspa/src/lib.rs:213:9] &channels = 1
2025-01-25T13:36:08.252Z | INFO |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | activate
[Parsed_ladspa_1 @ 0x78f618019a80] [debug] handles: 1
[auto_aresample_1 @ 0x78f618031c40] [SWR @ 0x78f618031d40] [debug] Using fltp internally between filters
[auto_aresample_1 @ 0x78f618031c40] [verbose] ch:1 chl:mono fmt:fltp r:48000Hz -> ch:1 chl:mono fmt:s16 r:48000Hz
2025-01-25T13:36:08.252Z | DEBUG |  df::tract | Loading model DeepFilterNet3_ll_onnx.tar.gz
[info] Output #0, wav, to 'dfn.wav':
[info]   Metadata:
[info]     ISFT            : Lavf61.9.106
[info]   Stream #0:0, 0, 1/48000: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, mono, s16, 768 kb/s
[info]     Metadata:
[info]       encoder         : Lavc61.31.101 pcm_s16le
[out#0/wav @ 0x587d9d287580] [verbose] Starting thread...
2025-01-25T13:36:08.252Z | WARN |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | Processing thread is overloaded! Dropping frame
[ladspa/src/lib.rs:403:17] &e = "Full(..)"
[ladspa/src/lib.rs:404:17] self.hop_size = 480
2025-01-25T13:36:08.252Z | WARN |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | Processing thread is overloaded! Dropping frame
[ladspa/src/lib.rs:403:17] &e = "Full(..)"
[ladspa/src/lib.rs:404:17] self.hop_size = 480

Safari77 avatar Jan 25 '25 13:01 Safari77

The Rikorose branch is okay.

The original implementation was designed for online (real-time) processing. It happens to work with FFmpeg since the plugin only cares about wall-clock time. If processing were slower than real time (filtering 1 s of audio took longer than 1 s), the delay would gradually increase, causing silent audio samples to be inserted into the output stream, until it eventually crashed with "Processing too slow!". It worked on your computer since your hardware is fast enough.

My new implementation is also designed for online processing with programs such as EasyEffects, but would drop samples more aggressively (and insert silent samples) since EasyEffects would lag the entire audio stream (all inputs and outputs) if samples are not received on time.

The more correct solution for offline processing (using FFmpeg on files, for example) would be to not care about timing at all; it should instead block the thread for each incoming sample as needed. This second approach could also work for real-time processing, but would rely on the other program to handle the case where processing falls behind. EasyEffects doesn't do this, hence the first approach.
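For comparison, a minimal sketch of the blocking approach, again using only the standard library. `filter_blocking` is a hypothetical function for this example and is not part of the plugin; the point is that blocking `send`/`recv` calls replace `try_send`, so nothing is ever dropped and the caller simply waits as long as processing takes.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical offline filter (illustration only): every frame is processed, none are dropped.
pub fn filter_blocking(input: &[f32], hop_size: usize, process: fn(&mut [f32])) -> Vec<f32> {
    assert!(hop_size > 0, "hop_size must be non-zero");
    let (tx_in, rx_in) = sync_channel::<Vec<f32>>(1);
    let (tx_out, rx_out) = sync_channel::<Vec<f32>>(1);

    let worker = thread::spawn(move || {
        while let Ok(mut frame) = rx_in.recv() {
            process(&mut frame);
            // Blocking send: wait for the caller instead of dropping the frame.
            if tx_out.send(frame).is_err() {
                break;
            }
        }
    });

    let mut output = Vec::with_capacity(input.len());
    // The final chunk may be shorter than hop_size; a real model would need it padded.
    for chunk in input.chunks(hop_size) {
        // Lockstep send/recv: the caller blocks until this frame has been processed.
        tx_in.send(chunk.to_vec()).expect("worker thread exited early");
        let frame = rx_out.recv().expect("worker thread exited early");
        output.extend_from_slice(&frame);
    }

    // Closing the input channel lets the worker's recv() fail so the thread can exit.
    drop(tx_in);
    worker.join().expect("worker thread panicked");
    output
}
```

With this shape the output has exactly as many samples as the input regardless of how slow `process` is, which is what a file-based tool needs.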

danielhuang avatar Feb 17 '25 01:02 danielhuang

I'm still getting stutters on output with your fork.

Edit: never mind, I did something wrong. This does fix the stutters; it's working perfectly.

XSilverTH avatar Apr 04 '25 08:04 XSilverTH

I've also gotten stutters in testing this fork in the ladspa plugin virtual microphone. I'm running on a reasonable machine (M1 series) so I would guess this shouldn't be happening?

Edit: Going from balanced to performance on Fedora resolves this... but with DeepFilter being as lightweight as it is, I'm surprised this is an issue at all?

jacksongoode avatar Aug 03 '25 06:08 jacksongoode