
How should I extract the features of noisy speech after mixing?

Open: huhuqwaszxedc opened this issue 1 year ago • 1 comment

Hello, my CutSet was obtained with CutSet.mix, so all of its cuts are MixedCut. I used the compute_and_store_features_batch function, but the features of the output CutSet only contain the features of the first track (i.e. the source audio). How should I extract the features of the noisy speech after mixing? Thank you very much for your work!

huhuqwaszxedc, Nov 01 '23 09:11

By default, features should already be extracted from the "mixed" speech, not just the first track. compute_and_store_features_batch calls load_audio internally, which has mixed=True set by default (https://github.com/lhotse-speech/lhotse/blob/c5f26afd100885b86e4244eeb33ca1986f3fa923/lhotse/cut/mixed.py#L1027). If that is not what you observe, you may need to step through the code with pdb to see where the mixing is being skipped.

desh2608, Nov 01 '23 13:11
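
For readers unsure what "mixed" audio means here, the following is a minimal conceptual sketch in plain Python. It is not lhotse's implementation: the helpers `power`, `mix_at_snr`, and `max_abs_diff` are hypothetical names introduced only for illustration, and the toy sine signals stand in for real speech and noise. It assumes the usual definition of mixing at a target SNR (scale the noise, then sum sample by sample) and shows a simple check for whether a signal really differs from its first track.

```python
import math


def power(samples):
    """Mean power of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Hypothetical helper: scale `noise` so that the speech-to-noise
    power ratio equals `snr_db`, then add it to `speech`."""
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


def max_abs_diff(a, b):
    """Largest sample-wise difference between two equal-length signals."""
    return max(abs(x - y) for x, y in zip(a, b))


# Toy signals: a 5-cycle sine as "speech", a 17-cycle sine as "noise".
speech = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
noise = [math.sin(2 * math.pi * 17 * t / 100 + 0.3) for t in range(100)]

mixed = mix_at_snr(speech, noise, snr_db=10)

# If this difference were zero, the "mixed" signal would just be the
# first track, which is the symptom described in the question.
diff = max_abs_diff(mixed, speech)
print(f"max |mixed - speech| = {diff:.4f}")
```

The same kind of comparison could serve as a quick diagnostic in a real pipeline: if the audio (or the features) obtained for a mixed cut is identical to that of its first track alone, the mixing step was skipped somewhere upstream.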