
Noise contributions contain very loud noise

Open codesoap opened this issue 11 months ago • 2 comments

I have taken a closer look at the noise contributions at media.xiph.org/rnnoise/rnnoise_contributions.tar.gz. With the help of sox, I skimmed through some of the loudest files and found many instances where the noise is so loud that I find it unreasonable to expect an AI model to recognize voice next to it. These are the most problematic files I've found:

  • 1506612372095-other.raw
  • 1506865846246-other.raw
  • 1506890776920-other.raw
  • 1506896387552-other.raw
  • 1506904933605-coffee.raw
  • 1506905761767-coffee.raw
  • 1506931866078-other.raw
  • 1506937851368-office.raw
  • 1506942115691-office.raw
  • 1507008551397-other.raw
  • 1507024121772-other.raw
  • 1507046472430-other.raw
  • 1507051246600-street.raw
  • 1507053038795-other.raw
  • 1507225021633-other.raw
  • 1507225705223-other.raw
  • 1507256882651-other.raw
  • 1507264564781-other.raw
  • 1507279040493-train.raw
  • 1507279110456-train.raw
  • 1507288337806-other.raw
  • 1506716634275-other.raw
  • 1507372594108-office.raw
  • 1508468651573-office.raw
  • 1508504834575-car.raw
  • 1508917528488-office.raw
  • 1509685708555-none.raw
  • 1509701170578-train.raw
  • 1511050964203-none.raw

I think removing those files from the dataset will improve the quality of the AI model.

There are many more files containing loud noise, but I've tried not to include files where a human could at least make out some voice next to the noise.
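
For anyone who wants to repeat the "loudest files" pass without sox, here is a minimal Python sketch that ranks raw files by RMS level. It assumes the contributions are headerless, mono, signed 16-bit PCM in native byte order; that format is an assumption here, not something stated in this thread. The function names (`rms_dbfs`, `read_raw_s16`, `rank_by_loudness`) are made up for illustration.

```python
import array
import math
import sys
from pathlib import Path


def rms_dbfs(samples):
    """Return the RMS level of 16-bit samples in dB relative to full scale."""
    if len(samples) == 0:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (32768.0 ** 2))


def read_raw_s16(path):
    """Read a headerless file as native-endian signed 16-bit samples.

    The sample format is an assumption; adjust it if the files differ.
    """
    data = Path(path).read_bytes()
    data = data[: len(data) - len(data) % 2]  # drop a trailing odd byte, if any
    samples = array.array("h")
    samples.frombytes(data)
    return samples


def rank_by_loudness(paths):
    """Return (level_in_dbfs, path) pairs, loudest first."""
    levels = [(rms_dbfs(read_raw_s16(p)), str(p)) for p in paths]
    return sorted(levels, reverse=True)


if __name__ == "__main__":
    for level, name in rank_by_loudness(sys.argv[1:]):
        print(f"{level:8.2f} dBFS  {name}")
```

Saved under a name of your choice and run with the `.raw` files as arguments, it prints one line per file, loudest first. RMS only measures level, so it can shortlist candidates but cannot tell loud-but-valid noise from a broken recording; listening is still needed for that.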

codesoap avatar Jan 11 '25 09:01 codesoap

I only listened to a few files but couldn't find anything horribly wrong with them. The volume is loud, but any mixing software should normalize the noise to obtain a target SNR, so louder noise shouldn't matter provided that it's representative of real noise. Or were some of the files clearly just broken?
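
To illustrate the normalization point, here is a minimal Python sketch of mixing at a target SNR. It is not taken from the RNNoise training code; the function names (`power`, `noise_gain_for_snr`, `mix_at_snr`) and the use of average power over the whole clip are assumptions made for illustration.

```python
import math


def power(x):
    """Average power (mean of squared samples) of a non-empty sequence."""
    return sum(v * v for v in x) / len(x)


def noise_gain_for_snr(speech, noise, snr_db):
    """Gain for the noise so that speech power / scaled noise power
    equals 10 ** (snr_db / 10)."""
    p_speech = power(speech)
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise is silent, cannot scale it to a target SNR")
    return math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise rescaled to the target SNR."""
    gain = noise_gain_for_snr(speech, noise, snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]


speech = [1.0, -1.0, 1.0, -1.0]
quiet_noise = [0.5, 0.5, -0.5, -0.5]
loud_noise = [50.0, 50.0, -50.0, -50.0]  # same shape, 100 times the amplitude

# Both mixes end up identical: the recording level of the noise cancels out.
mix_quiet = mix_at_snr(speech, quiet_noise, 10.0)
mix_loud = mix_at_snr(speech, loud_noise, 10.0)
```

Because the gain is derived from the measured noise power, the original recording level drops out of the mix. What does not drop out is distortion that was baked in at recording time, such as clipping.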

jmvalin avatar Jan 22 '25 02:01 jmvalin

Fair enough. I wasn't aware that normalization happened before mixing with voice. In that case, I don't see a problem with most of the files I listed either.

However, there are a few files that sound so distorted that I think the microphone may have been faulty. Do you think they should be excluded?

  • 1506937851368-office.raw
  • 1507046472430-other.raw
  • 1507256882651-other.raw
  • 1507372594108-office.raw
  • 1508917528488-office.raw

The following three files contain someone actively blowing or burping into the microphone. I'm not sure if that's necessarily bad noise, but it is probably something that would not occur when someone is using a noise suppressor:

  • 1507008551397-other.raw
  • 1507225705223-other.raw
  • 1507225021633-other.raw

codesoap avatar Jan 23 '25 08:01 codesoap