DTLN
Retrained model is not as good as the provided pretrained models.
Thanks for your wonderful work, @breizhn.
I used this project to retrain on the recently updated DNS-Challenge dataset. The denoised output of the retrained model is a little worse than that of the models provided in this project (both the 40h and the 500h model). The only setting I changed was 'norm_stft'=True.
Do you have any advice for improving the retrained model's performance?
Looking forward to your reply.
@liziru Have you finished all 200 training epochs? And what is the final val_loss?
Thanks for your reply. The training doesn't stop until the early-stopping callback's patience is exhausted, and the final loss is close to -21.3. What about your training loss?
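(For context on the stopping behaviour: as described above, training relies on a Keras EarlyStopping callback rather than always running the full 200 epochs. A hedged sketch of that wiring is below; the monitored metric and the patience value of 10 are assumptions, so check the repo's training code for the actual settings.)

```python
# Sketch of the early-stopping wiring described above (values are assumptions).
import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",   # stop once the validation loss stops improving
    patience=10,          # assumed number of epochs without improvement
)
# model.fit(..., epochs=200, callbacks=[early_stop, ...])
```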
Yes, I also realized that. Some of the additional data from the ICASSP DNS-Challenge is very noisy (especially the German data); the English speech from the first challenge has much better quality. You could clean up the data by running it through the baseline model or one of the models provided here and discarding all "clean" speech files with a bad SNR. Just process the clean files with a speech enhancement model (make sure the model does not introduce any latency) and subtract the enhanced speech from the input file; this gives you the residual noise. Then compare the power of the residual noise with the power of the enhanced speech. If that SNR is only around 5 dB, the file is probably not really clean speech.
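A minimal sketch of that screening step in Python, assuming a hypothetical zero-latency enhancement function enhance(audio, sr) that wraps whichever model you use; the helper name and the 5 dB threshold are illustrative, not part of the repo:

```python
import numpy as np
import soundfile as sf

def estimated_snr_db(path, enhance):
    """Estimate how clean a 'clean' speech file really is."""
    audio, sr = sf.read(path)
    enhanced = enhance(audio, sr)      # enhanced speech estimate (must stay time-aligned)
    residual = audio - enhanced        # residual noise left in the "clean" file
    speech_power = np.mean(enhanced ** 2) + 1e-12
    noise_power = np.mean(residual ** 2) + 1e-12
    return 10.0 * np.log10(speech_power / noise_power)

# Keep only files whose estimated SNR is reasonably high, e.g. above roughly 5 dB:
# clean_files = [f for f in clean_files if estimated_snr_db(f, enhance) > 5.0]
```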
@liziru What do you mean by training loss? I only pay attention to the val_loss displayed during training. What are your environment settings? Did you create the environment with train_env.yml? I was surprised by your final loss of around -21.3, because I have tried a lot of parameter combinations and my best result is only about -16.9. By the way, my training data is from the repo forked by the owner. Have you tried this data before?
@liziru How did you get a loss of -21.3? My current parameters are:
audio_length: 30
silence_length: 0.0
total_hours: 40
snr_lower: -10
snr_upper: 10
total_snrlevels: 5
Training data : validation data = 7:3
Noise types: 9
The DTLN parameters are set to the defaults in the git code.
Final result: loss -11.7281, validation loss -11.0988.
Any advice would be appreciated.
Thanks a lot @breizhn, I will give it a try.
I just followed the settings from the repo and the paper for training, and I found that a large amount of data helps a lot. I did not use the training data referred to in the repo. Have you reviewed the paper?
'total_snrlevels' should be 30. And you should review the paper for the detailed training data settings.
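For concreteness, a sketch of the synthesizer settings with that change applied; only total_snrlevels = 30 and the -5 to 25 dB SNR range (which also appears later in this thread) come from the discussion, while the remaining values simply mirror the parameter list quoted above and are illustrative rather than an authoritative recipe:

```python
# Hypothetical noisyspeech-synthesizer-style settings, written as a Python dict.
# Only total_snrlevels and the -5..25 dB SNR range are taken from this thread;
# the rest mirrors the earlier parameter list and is purely illustrative.
synth_params = {
    "audio_length": 30,       # seconds per generated file
    "silence_length": 0.0,
    "total_hours": 500,       # 40 in the post above; the thread suggests more data helps
    "snr_lower": -5,
    "snr_upper": 25,
    "total_snrlevels": 30,    # roughly 1 dB spacing across the -5..25 dB range
}
```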
Thank you. Thanks to this, the validation loss came down to about -16. However, the performance is still worse than model.h5 (the pretrained model provided in the code).
@liziru Hi, roughly how many epochs did your retraining run before stopping? Mine always stops after 80-odd epochs, and the trained model has 3990352 parameters, which does not match the parameter counts of the pretrained models (norm 4003624 and 3989312).
You could compare the denoising results of the two models.
@liziru Judging from the waveforms, the denoising is a bit worse than with the pretrained model. The parameter count is just the size of the model saved at the end of training, and the training data was also generated with SNRs from -5 to 25 dB.
The training data is not exactly the same; my results are also a bit worse.
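A side note on the parameter counts mentioned above: the two pretrained figures most likely correspond to the variants with and without STFT normalization (the norm_stft flag from the first post). Below is a minimal sketch for checking which variant your own training script built, assuming the repo's DTLN_model class and its build_DTLN_model(norm_stft=...) method; verify the names against your checkout.

```python
# Sketch: compare the parameter counts of the two DTLN variants.
# Assumes the repo's DTLN_model class with a build_DTLN_model(norm_stft=...)
# method and a .model attribute; adjust the names if your code differs.
from DTLN_model import DTLN_model

for norm_stft in (True, False):
    builder = DTLN_model()
    builder.build_DTLN_model(norm_stft=norm_stft)
    print("norm_stft =", norm_stft, "->", builder.model.count_params(), "parameters")
```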
@liziru Hi, how can you get a val_loss of around -21? In my case I get train_loss 0.0011 and val_loss 46. I didn't change anything in this repository or in the 500h data configuration (the same as the provided breizhn/dns-challenge). Can you give me some hints?