Yi-Hua Chiu

Results: 12 comments by Yi-Hua Chiu

To use the `rnnoise` datasets, we have to `normalize the volume` and convert the frame rate to `16000` manually; many of the rnnoise audio files are almost silent without volume normalization. This `mix...
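As a rough illustration of the volume normalization mentioned above, here is a minimal sketch that scales a clip to a target RMS level in dBFS. It is not the code from the pull request: the function names, the `-20.0` dBFS target, and the float sample range of [-1.0, 1.0] are all assumptions made for the example, and frame-rate conversion to 16000 is left to an audio tool.

```python
import math

def rms_dbfs(samples):
    """RMS level in dBFS for float samples in [-1.0, 1.0]; -inf for silence."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")

def normalize_volume(samples, target_dbfs=-20.0):
    """Scale samples so their RMS level reaches target_dbfs (hypothetical default)."""
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return list(samples)  # pure silence: nothing to scale
    gain = 10.0 ** ((target_dbfs - level) / 20.0)
    # Clip to the valid range in case the gain pushes peaks past full scale.
    return [max(-1.0, min(1.0, s * gain)) for s in samples]

# A very quiet 440 Hz tone, similar in spirit to a near-silent noise file.
quiet = [0.001 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
louder = normalize_volume(quiet)
```

A clip that starts around -63 dBFS ends up at the target level, which is what makes the near-silent files usable as noise sources.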

> @mychiux413 any idea how this can be done? Should it be an online process?

You should prepare the normalized noise files yourself before training starts. There is no...

I added `bin/normalize_noise_audio.py` and made some modifications:
1. Removed `typing` for environment compatibility
2. Fixed a pylint error and added a warning message for ImportError of `tqdm` & `pydub`, because they are not...

@alokprasad You're right. In fact, all augmented audio should be reviewable in the pipeline, even augmentations applied to the spectrogram such as pitch/tempo/mask..., or we would not have a concept...

@alokprasad I tried `tf.print` and listened to the audio; it really is augmented. Maybe my default parameters are too conservative (because some of the noise data is "speech noise", I don't know what would...

@alokprasad But when `uniform()` picks a low noise_ratio such as -35 dB, I consider the noise to be approximately absent, so why should we have to keep "clean audio" for...

@alokprasad To prevent this, I just repeat the noise file until its duration exceeds that of the speech file. This might give some `continuous environment noise` (like street) audio `discontinuous points`,...
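The repetition described above can be sketched as follows. This is only an illustration on plain Python lists: `tile_noise` is a hypothetical helper, and trimming the repeated noise to exactly the speech length is an assumption, not something the comment confirms.

```python
def tile_noise(noise, target_len):
    """Repeat a noise clip until it covers target_len samples, then trim.

    Each repetition starts again at sample 0, so every seam is a potential
    discontinuity in otherwise continuous environment noise.
    """
    if not noise:
        raise ValueError("noise must not be empty")
    repeats = -(-target_len // len(noise))  # ceiling division
    return (list(noise) * repeats)[:target_len]

# A 3-sample noise clip stretched over a 7-sample speech clip.
covered = tile_noise([1, 2, 3], 7)
```

The jump from the last sample back to the first at each seam is where the discontinuities come from; a short cross-fade at the seams would be one way to soften them, though that is not something described here.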

@alokprasad
* For the current version, the parameters `--audio_aug_mix_noise_max_noise_db -5`, `--audio_aug_mix_noise_min_noise_db -35`, `--audio_aug_mix_noise_max_audio_db 5`, `--audio_aug_mix_noise_min_audio_db -10` give good results for me.
* Yes, using SNR should be more...
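To make the effect of these dB ranges concrete, here is a minimal sketch that draws one gain for the speech and one for the noise uniformly in dB and sums the two signals. The interpretation of the four flags as gain ranges is an assumption, and `db_to_gain` and `mix_with_noise` are hypothetical names; the actual augmentation code may work differently.

```python
import random

def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def mix_with_noise(speech, noise, rng,
                   min_audio_db=-10.0, max_audio_db=5.0,
                   min_noise_db=-35.0, max_noise_db=-5.0):
    """Scale speech and noise by gains drawn uniformly in dB, then sum them.

    The defaults mirror the flag values quoted above; treating them as
    gain ranges is an assumption of this sketch.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    audio_gain = db_to_gain(rng.uniform(min_audio_db, max_audio_db))
    noise_gain = db_to_gain(rng.uniform(min_noise_db, max_noise_db))
    mixed = [audio_gain * s + noise_gain * n for s, n in zip(speech, noise)]
    # Clip so the sum stays inside the valid float sample range.
    return [max(-1.0, min(1.0, m)) for m in mixed]

rng = random.Random(0)  # seeded only to make the sketch repeatable
mixed = mix_with_noise([0.1] * 100, [0.0] * 100, rng)
```

At -35 dB the noise is multiplied by less than 0.02, which is why a draw near the lower bound behaves almost like clean audio.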

@alokprasad I'm still developing this and there are still some issues, so please do not use the commit yet. @DanBmh I will make the arguments also accept CSV files for the cocktail-party use case....

Update: specify the dBFS and S/R to determine the balance of audio/noise, and support CSV files for the cocktail-party use case.
* Now we can select noise files by directory or...