LSTM_PIT_Speech_Separation

Preprocessing of Dataset to feed into LSTM

Open divyeshrajpura4114 opened this issue 5 years ago • 3 comments

Can you please explain the procedure or steps for pre-processing the data before feeding it to the LSTM? I am working on the paper by Zhuo Chen, "Speaker-Independent Speech Separation With Deep Attractor Network", but I am not able to create batches because each audio file has a different number of frames. How do you handle variable-length input to the LSTM? I know about techniques like sequence padding, but I don't think that would be effective here because the difference in the number of frames between files is very large.

divyeshrajpura4114 May 27 '19 12:05

Hi, you can read these two files: tfrecords_io.py and run_lstm.py.
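For readers who want something concrete before opening those files, here is a minimal NumPy sketch of one common way to batch utterances with different frame counts. It is not taken from tfrecords_io.py or run_lstm.py, and the helper names (`bucket_by_length`, `pad_batch`, `masked_mse`) are hypothetical. The idea is to sort utterances by length so each batch holds similar lengths, pad only up to the longest utterance in that batch, and mask the padded frames out of the loss.

```python
import numpy as np

def bucket_by_length(features, batch_size):
    """Yield groups of utterances with similar frame counts (hypothetical helper)."""
    order = np.argsort([f.shape[0] for f in features])
    for start in range(0, len(order), batch_size):
        yield [features[i] for i in order[start:start + batch_size]]

def pad_batch(features):
    """Zero-pad a list of [frames, dim] arrays to the longest one in the list."""
    lengths = np.array([f.shape[0] for f in features], dtype=np.int32)
    max_len = int(lengths.max())
    dim = features[0].shape[1]
    batch = np.zeros((len(features), max_len, dim), dtype=np.float32)
    for i, f in enumerate(features):
        batch[i, :f.shape[0]] = f
    # mask is 1.0 on real frames and 0.0 on padded frames
    mask = (np.arange(max_len)[None, :] < lengths[:, None]).astype(np.float32)
    return batch, lengths, mask

def masked_mse(pred, target, mask):
    """Mean squared error averaged over real (unpadded) frames only."""
    per_frame = ((pred - target) ** 2).mean(axis=-1)  # [batch, frames]
    return float((per_frame * mask).sum() / mask.sum())

# Toy data: four "utterances" with 3, 10, 4 and 9 frames of 2-dim features.
rng = np.random.default_rng(0)
utterances = [rng.standard_normal((n, 2)).astype(np.float32) for n in (3, 10, 4, 9)]
batches = [pad_batch(group) for group in bucket_by_length(utterances, batch_size=2)]
```

With bucketing, the 3-frame and 4-frame utterances end up in one batch and the 9-frame and 10-frame utterances in another, so very little padding is needed even when lengths vary a lot across the dataset. In TensorFlow 1.x the `lengths` array can also be passed as the `sequence_length` argument of `tf.nn.dynamic_rnn`, so the LSTM stops updating its state past the end of each utterance.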

aishoot May 28 '19 08:05

OK, I will look into that. Thank you.

divyeshrajpura4114 May 28 '19 09:05

If we are able to create our own mixed WAV files, is there still any need to get the SNR of the audio files?

nagasaibharath Jun 13 '19 12:06