CMGAN
How to train on sample rate = 44100?
I get this error:
Run model on reference(ref) and degraded(deg) Sample rate (fs) - No default. Must select either 8000 or 16000. Note there is narrow band (nb) mode only when sampling rate is 8000Hz.
It seems the model can only be trained at 8000 Hz or 16000 Hz?
You can train the model on 44.1 kHz tracks: just comment out the assert line. However, because of the self-attention computation, you will need a very large amount of GPU memory to train.
How can I run inference on a large WAV file? When I run inference on a 60 s, 44100 Hz audio file, it uses too much GPU memory and then stops. I set cut_length=44100*1. Do you have any idea how to solve this problem? Thank you very much. @ruizhecao96
Inferring a one-second track at 44.1 kHz should be fine. I can infer a 10-second track at 16 kHz on a 24 GB GPU. How much memory does your GPU have?
My GPU has 24 GB. I can infer a track of at most about 10 s at 44100 Hz. I split the audio into 10 s chunks and then combine them, but I get audio problems at the joins. If I want to infer 60 s, what should I do? Do you have any solution? Thank you very much. @ruizhecao96
I suggest making the number of samples in each batch element divisible by 400, because the STFT window length is 400; this might solve the connection problem. For example, for a track of length 441000 you can cut it to 440000, reshape it to (11, 40000), and use that as input.
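The trimming and reshaping described above could look roughly like the sketch below. It uses plain NumPy rather than the repo's own inference code, and the helper name `split_into_chunks` and its parameters are made up for illustration. It only shows the arithmetic (441000 samples cut to 440000, then reshaped to (11, 40000)); whether this removes the artifacts at the joins still has to be checked by ear.

```python
import numpy as np

def split_into_chunks(audio, chunk_len=40000, win_len=400):
    """Trim a 1-D track and reshape it to (n_chunks, chunk_len).

    Hypothetical helper: chunk_len must be a multiple of win_len
    (400 is the STFT window length mentioned in this thread).
    Returns the batch and the leftover tail that did not fit.
    """
    if chunk_len % win_len != 0:
        raise ValueError("chunk_len must be divisible by win_len")
    n_chunks = len(audio) // chunk_len
    if n_chunks == 0:
        raise ValueError("track is shorter than one chunk")
    usable = n_chunks * chunk_len
    batch = audio[:usable].reshape(n_chunks, chunk_len)
    return batch, audio[usable:]

# Dummy 10 s track at 44100 Hz (441000 samples), as in the example above.
track = np.arange(441000, dtype=np.float32)
batch, remainder = split_into_chunks(track)
print(batch.shape, remainder.shape)  # (11, 40000) (1000,)

# After running the model on each row, flatten the rows back into one track.
restored = batch.reshape(-1)
```

The 1000 leftover samples are returned instead of silently dropped, so you can decide whether to pad them to another multiple of 400 or discard them.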