
Possible to create a pipeline that subtracts speaker out from mic in?

Open fquirin opened this issue 2 years ago • 4 comments

Hi Henrik,

I recently discovered this project and was asking myself if it could help me with a challenge I was facing in my open-source voice assistant project (SEPIA).

The DIY client of SEPIA is built to work as a kind of smart speaker/smart display, with speakers and microphone very close together. There is a wake-word detection engine and it works pretty well ... until you start to play music. Since output and input are both handled on the same device, I'm thinking about mixing the signals to reduce the music background on the microphone input. A few days ago I actually saw a sketch by Apple about the same topic (looking pretty fancy):

image

Do you think this is something one could try using CamillaDSP? Any feedback would be greatly appreciated :slightly_smiling_face:

Cu, Florian

P.S.: Is it possible we worked together at SLAC once or twice? :grin:

fquirin avatar Dec 06 '21 21:12 fquirin

Hi! This seems like an interesting application! I haven't considered using CamillaDSP for anything like this before. I think the main challenge will be how to record the sound being output to the speaker. ALSA doesn't have any easy way of recording the output of a device. There are ways, but it gets quite complicated: https://stackoverflow.com/questions/12984089/capture-playback-on-play-only-sound-card-with-alsa

You also get the additional problem of needing to capture the microphone input at the same time, preferably with a fixed delay with respect to the playback. I suppose the devices in question have a stereo input, would it be acceptable to use a single microphone and use the second channel to record the analog output to the speaker? Or perhaps use two microphones, one close to the speaker and one further away? That way you could record everything from a single capture device and have no trouble with sync.

The subtraction of the music signal should be possible to do in CamillaDSP. If needed (I guess yes) you can apply some filtering to the music and/or microphone signals before mixing them. Getting that right will probably take some work! Should be fun though :)
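As a rough illustration of the subtraction idea (plain Python, not a CamillaDSP configuration), the sketch below assumes the microphone picks up the speech plus a delayed, attenuated copy of the reference channel. The function name `subtract_reference`, the test signals, and the delay and gain values are all made up for the example. In a real setup the delay and gain would have to be measured, and the speaker-to-microphone path would likely need filtering rather than a single gain.

```python
def subtract_reference(mic, ref, delay, gain):
    """Subtract a delayed, scaled copy of `ref` from `mic`, sample by sample."""
    out = []
    for n, m in enumerate(mic):
        r = ref[n - delay] if n >= delay else 0.0
        out.append(m - gain * r)
    return out


# Hypothetical test signals: integer-valued so the arithmetic is exact.
music = [float((7 * n) % 11 - 5) for n in range(32)]
speech = [float((3 * n) % 5 - 2) for n in range(32)]

DELAY = 3    # assumed speaker-to-mic delay in samples
GAIN = 0.5   # assumed attenuation of the music at the microphone

# Simulated microphone channel: speech plus the delayed, attenuated music.
mic = [
    s + GAIN * (music[n - DELAY] if n >= DELAY else 0.0)
    for n, s in enumerate(speech)
]

cleaned = subtract_reference(mic, music, DELAY, GAIN)          # right delay
wrong_delay = subtract_reference(mic, music, DELAY + 1, GAIN)  # off by one
```

With the correct delay and gain, `cleaned` equals `speech`. With the delay off by one sample, music remains in the output, which is why a fixed, known delay between the two channels matters.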

P.S.: Is it possible we worked together at SLAC once or twice? 😁

I'd say there is a quite high possibility for that 😀

HEnquist avatar Dec 07 '21 21:12 HEnquist

I think the main challenge will be how to record the sound being output to the speaker

My very naive hope was that it's actually possible to create a virtual device, e.g. via PulseAudio, that mixes microphone and output ... but I haven't really thought this through I guess :see_no_evil:

I suppose the devices in question have a stereo input, would it be acceptable to use a single microphone and use the second channel to record the analog output to the speaker?

:thinking: that sounds like a very interesting approach. I'm using 2-MIC or 4-MIC HAT for the Raspberry Pi quite a lot and I think one could spare half of the mics!

The subtraction of the music signal should be possible to do in CamillaDSP. If needed (I guess yes) you can apply some filtering to the music and/or microphone signals before mixing them. Getting that right will probably take some work! Should be fun though :)

Sounds promising :grin: . I'll put it on my to-do list and probably start to play around with CamillaDSP after the next SEPIA update :-)

I'd say there is a quite high possibility for that

It's crazy how small this world has become :star_struck:

I'll come back here soon to follow up on the topic ;-) cu!

fquirin avatar Dec 09 '21 10:12 fquirin

I think it requires very low latency to do the subtraction correctly. A DSP might be necessary?
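One way to put numbers on the timing requirement is to look at how much of a sine wave survives when a copy that is misaligned by a few samples is subtracted from it. The sketch below uses the identity sin(a) - sin(a - b) = 2 sin(b/2) cos(a - b/2), so the residual amplitude is 2·|sin(π·f·d/fs)| for frequency f, offset d samples and sample rate fs. The function name and the 48 kHz rate are assumptions for the example. It only covers misalignment between two otherwise identical signals and says nothing about processing latency itself.

```python
import math


def residual_db(freq_hz, offset_samples, sample_rate):
    """Level of the residual, relative to the original sine, in dB.

    Assumes a unit-amplitude sine minus the same sine shifted by
    `offset_samples`. The offset must not be a whole number of periods,
    since the residual is then zero and the log is undefined.
    """
    amplitude = 2.0 * abs(math.sin(math.pi * freq_hz * offset_samples / sample_rate))
    return 20.0 * math.log10(amplitude)


FS = 48000  # assumed sample rate in Hz

for freq in (100, 1000, 8000):
    print(f"{freq:5d} Hz, 1 sample off: {residual_db(freq, 1, FS):6.1f} dB")
```

Under these assumptions a single sample of misalignment still gives good cancellation at low frequencies but none at all at 8 kHz, where the residual is as loud as the original.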

t123yh avatar Feb 28 '22 14:02 t123yh

I've been experimenting a lot over the last few weeks with the PulseAudio module-echo-cancel. At first I thought it wouldn't work because of too many artifacts and too little attenuation, but then I managed to use the 'speex' engine of the module with very promising results! I've been using a Raspberry Pi HAT with microphone and audio codec on board, so I think this is probably necessary to handle the delay between the two properly. In theory there is some kind of drift compensation as well, so it might even work with 2 different sources (separated by hardware).

What is still bothering me is the limited control one has over the algorithm. Maybe one could improve on those algorithms or build a new PulseAudio plugin with custom code, similar to RNNoise LADSPA.
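To give an idea of what custom echo-cancelling code could look like at its simplest, here is a minimal sketch of a normalized LMS (NLMS) adaptive filter in plain Python. This is not the algorithm used by module-echo-cancel or its speex engine; NLMS is swapped in only because it is short enough to show. The echo path, tap count and step size are hypothetical, and the simulated microphone signal contains only the echo (no speech, no noise), which makes convergence far easier than in a real room.

```python
import random


def nlms_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Return mic minus an adaptively filtered copy of ref (NLMS)."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    out = []
    for m, r in zip(mic, ref):
        history = [r] + history[:-1]
        estimate = sum(w * x for w, x in zip(weights, history))
        error = m - estimate  # what is left after removing the estimated echo
        norm = sum(x * x for x in history) + eps
        weights = [w + mu * error * x / norm for w, x in zip(weights, history)]
        out.append(error)
    return out


def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)


random.seed(0)
N = 4000
ref = [random.uniform(-1.0, 1.0) for _ in range(N)]  # stand-in for the music

# Hypothetical speaker-to-mic path: two samples of delay plus a short decay.
path = [0.0, 0.0, 0.6, 0.3, -0.1]
mic = [
    sum(h * ref[n - k] for k, h in enumerate(path) if n - k >= 0)
    for n in range(N)
]

residual = nlms_cancel(mic, ref)

# Ratio of residual power to echo power over the last 500 samples.
attenuation = power(residual[-500:]) / power(mic[-500:])
```

The filter learns the delay and gain of the echo path by itself, which is the main difference from a fixed subtraction. Real speech in the microphone signal would disturb the adaptation, and handling that well is a large part of what practical echo cancellers add.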

fquirin avatar Mar 01 '22 16:03 fquirin

Closed as part of spring cleaning

HEnquist avatar May 15 '23 12:05 HEnquist