Clean-Samples
Sample Quality, Normalization and Loudness
Thanks for the effort so far!
In ddbc883c324159e2591c4c580d487d68dc1152c1 the README states:
We recommend normalising them to xxx dB
I did some experiments with Dirt-Samples in the past and found that normalization is complicated. What comes to my mind is:
- Some drum machine samples have accent and normal level sounds – don't break the dynamic.
- If a loop is cut into slices – don't break the dynamic between slices.
- Short percussive samples and long pad sounds do not sound right together when normalized by peak, RMS or even EBU-R128.
So my suggestions for rephrasing this are:
- Sample true-peak MUST NOT exceed 0dBTP. EBU recommends -1dBTP at 4x-oversampling.
- Default sample loudness (not level) should mix musically well with programme material that roughly follows EBU-R128. "Musically" means that those gabba samples are intended to be very loud, while a whisper is intended to sound quiet. The average non-percussive sample SHOULD be around -23dB RMS.
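To make the two numbers above concrete, here is a minimal sketch of how sample peak and RMS level could be measured, assuming the audio is already decoded to floats in the range -1.0 to 1.0. The function names are made up for illustration. Note that this measures the plain sample peak, not the true peak: a real dBTP measurement needs oversampling, so the value here is only a lower bound.

```python
import math


def to_db(amplitude: float) -> float:
    """Convert a linear amplitude to decibels; silence maps to -inf."""
    return 20.0 * math.log10(amplitude) if amplitude > 0.0 else float("-inf")


def sample_peak_db(samples) -> float:
    """Sample peak in dBFS (NOT true peak, no oversampling is done)."""
    return to_db(max((abs(s) for s in samples), default=0.0))


def rms_db(samples) -> float:
    """RMS level in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    return to_db(math.sqrt(mean_square))


# One second of a full-scale 440 Hz sine at 48 kHz as test material.
rate = 48000
sine = [math.sin(2.0 * math.pi * 440.0 * i / rate) for i in range(rate)]

print(f"peak: {sample_peak_db(sine):.2f} dBFS, RMS: {rms_db(sine):.2f} dB")
```

A full-scale sine sits about 3 dB below its peak in RMS terms, so a sample around -23dB RMS leaves a lot of headroom before it gets anywhere near 0dBTP.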
Probably this could form a new section in the README on "Sample Quality". Also with:
- Samples SHOULD NOT have DC-offset. Some kick-sounds naturally have a non-zero mean, though.
- Samples MAY be ready-to-use bandpass filtered. Consider that playback speed might be altered.
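As a rough illustration of the DC-offset point, the offset can be estimated as the mean of the decoded samples. This is only a sketch with hypothetical function names, assuming float samples centred on zero. Subtracting the mean is the crudest possible fix; a gentle high-pass filter is a common alternative, and for kick-like sounds with a naturally non-zero mean it may be better to leave the sample alone.

```python
def dc_offset(samples) -> float:
    """Estimate the DC offset as the arithmetic mean of the samples."""
    return sum(samples) / len(samples) if samples else 0.0


def remove_dc(samples):
    """Return a copy of the samples with the mean subtracted."""
    offset = dc_offset(samples)
    return [s - offset for s in samples]


shifted = [0.1, 0.3, 0.5]
print(dc_offset(shifted))             # mean of the shifted signal
print(dc_offset(remove_dc(shifted)))  # close to zero after correction
```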
What do you think?
This is good. Some more detail on how this should look (e.g. waveforms), a few examples of good samples, or a reference set would be great in a README.
Thanks @jkbd! I know very little about sample mastering really and this all sounds great.
I think we should be careful not to discourage contributions, so tackling this in two parts in the README could be a good idea. One bit that says more or less what it does now, and another 'advanced' bit with all this additional advice.
Personally I'm not super worried about this stuff, I guess loudness is just another property of a sample that I play with when improvising. Our aim isn't necessarily to compete with commercial software in making conventional sounding recordings for radio play. So the priority is to have a nice diverse set of sample banks that people can contribute to as easily as possible.
On the other hand this is all good advice, which we should try to follow when choosing sample packs for the 'default' superdirt samples etc. This sample mastering could happen at a later stage though.
I've updated the readme a bit - PRs very welcome!
Later I might provide a Python script that complains if a given *.wav violates the MUST-constraints. But until then, some more ideas:
- Noise floor SHOULD be below -70dB RMS – unless you really have to use that vintage tape recorder for creative reasons...
- Samples SHOULD be without noticeable reverberation – except e.g. that "gunshot in the woods", where the dry version would not be distinguishable and the version with artificial reverberation does not sound like the real thing.
- Stereo samples SHOULD be mono-compatible. That is, the left and right channels mix without audible comb-filtering effects and the audio can be cut to vinyl.
- Stereo samples MUST NOT be Mid/Side-Stereo. Convert to Left/Right for convenience.
- Samplerates MUST be in the range of common audio hardware. So in 2021 44.1kHz or 48kHz times {1,2,4} is common. Be sure to watch Xiph.Org's Monty on Digital Audio before discussing samplerates. With creative use of replay rates, audio with high sample-rates can make an audible difference. Just be sure your software does not resample the buffer before usage.
- Sample encoding SHOULD be Signed 24-bit Little-Endian Integer or 32-bit Floating Point PCM, because samples are meant to be compressed and distorted so the least significant bits come into use. If RAM is a limitation, all samples can easily be converted to 16-bit with a script.
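Two of the points above lend themselves to a quick automated check. The sketch below is not the promised script, just an illustration with made-up names: it builds the set of "44.1kHz or 48kHz times {1,2,4}" rates, and computes a zero-lag correlation between left and right as a very rough mono-compatibility hint. A value near -1 suggests the channels cancel when summed. A good correlation value does not rule out comb filtering from small inter-channel delays, so listening to the mono sum is still needed.

```python
import math

# 44.1 kHz and 48 kHz, each times 1, 2 and 4.
COMMON_RATES = {base * mult for base in (44100, 48000) for mult in (1, 2, 4)}


def is_common_rate(rate: int) -> bool:
    """True if the sample rate is one of the common hardware rates."""
    return rate in COMMON_RATES


def channel_correlation(left, right) -> float:
    """Normalised zero-lag correlation of two channels, in [-1, 1]."""
    numerator = sum(l * r for l, r in zip(left, right))
    energy = sum(l * l for l in left) * sum(r * r for r in right)
    return numerator / math.sqrt(energy) if energy > 0.0 else 0.0


left = [0.1, -0.2, 0.3, -0.4]
inverted = [-s for s in left]

print(sorted(COMMON_RATES))
print(channel_correlation(left, left))      # identical channels
print(channel_correlation(left, inverted))  # phase-inverted channels
```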
Good points!
As far as SuperDirt is concerned (I see that this may not matter)
- all samples are converted as read to 32-bit float (so memory makes no difference)
- all sample rates are supported in principle (one may still want to restrict them for aesthetic reasons of course)
Some analysis of Dirt-Samples:

You can peek at the Python code.
DC-offset can be improved a lot! (Edit: true_peak() was wrong in the first place.)
Interesting! Dirt-Samples is mostly just the contents of my hard drive from ten years ago..
I just learned, 8-bit samples can possibly be unsigned. That file is also from Dirt-Samples:
$ soxi ./sugar/001_crab.wav
Input File : './sugar/001_crab.wav'
Channels : 1
Sample Rate : 11025
Precision : 8-bit
Duration : 00:00:17.18 = 189390 samples ~ 1288.37 CDDA sectors
File Size : 189k
Bit Rate : 88.2k
Sample Encoding: 8-bit Unsigned Integer PCM
With unsigned encoding the mean of a sinewave is not zero! The script above counted this file and some more as "with DC-offset".
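A small sketch of what goes wrong, assuming the usual WAV convention that 8-bit PCM is unsigned with its midpoint at 128 (`decode_u8` is a hypothetical helper, not part of the script above). Taking the mean of the raw bytes gives roughly 128, while the same data re-centred around zero has a mean close to 0.

```python
import math


def decode_u8(raw: bytes):
    """Map unsigned 8-bit PCM (midpoint 128) to floats around zero."""
    return [(b - 128) / 128.0 for b in raw]


# One cycle of a sine wave stored as unsigned 8-bit values.
n = 100
raw = bytes(128 + round(100 * math.sin(2.0 * math.pi * i / n)) for i in range(n))

naive_mean = sum(raw) / len(raw)        # mean of the raw unsigned bytes
centered_mean = sum(decode_u8(raw)) / n  # mean after re-centring

print(naive_mean, centered_mean)
```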
Heh, those samples are probably ripped from a MAME rom. I think I made the 'mp3' ones by reading mp3 files as raw data.