audioset_tagging_cnn
Assertion error and low MAP on bal/eval set
- I am getting an assertion error while running your script to create the hdf5 files. It occurs in the float32_to_int16() conversion. Here is a simplified version:
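import librosa
import numpy as np

def float32_to_int16(x):
    assert np.max(np.abs(x)) <= 1.
    return (x * 32767.).astype(np.int16)

# wav_files is my list of audio file paths
aud, sr = librosa.core.load(wav_files[0], sr=32000, mono=True)
print(np.max(np.abs(aud)))  # peak of the float waveform, before conversion
>>> 1.0048816
aud = float32_to_int16(aud)  # raises AssertionError because the peak exceeds 1.0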
Some of my audio files are out of range, with peak amplitudes slightly above 1.0. If I comment out the assertion, everything works. Is it correct to remove the assertion?
- I am also getting low MAP scores on the balanced set and the evaluation set when using your trained models.
ResNet38:
    bal set: 0.52
    eval set: 0.37
CNN10:
    bal set: 0.48
    eval set: 0.32
Do you think the assertion issue above has anything to do with it? I prepared the data with the assertion commented out.
Yes, you can remove the assertion. The results you got are correct. If trained on the full dataset, they would be better.
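
For anyone who would rather keep a guard than drop the check entirely, here is a minimal sketch of one alternative: clip the waveform to [-1, 1] before scaling, so that samples pushed slightly past full scale (resampling at load time can do this, for example) cannot fall outside the int16 range. The helper name `float32_to_int16_clipped` is hypothetical and not part of audioset_tagging_cnn, and the input is a synthetic array rather than a real clip.

```python
import numpy as np


def float32_to_int16_clipped(x):
    """Convert float audio to int16, clipping samples outside [-1, 1].

    Hypothetical helper for illustration; not the repository's function.
    """
    x = np.asarray(x, dtype=np.float32)
    # Clamp overshoots so the scaled values stay inside the int16 range.
    x = np.clip(x, -1.0, 1.0)
    return (x * 32767.0).astype(np.int16)


# Synthetic waveform whose peak slightly exceeds 1.0, like the value reported above.
aud = np.array([0.0, 0.5, -1.0, 1.0048816], dtype=np.float32)
pcm = float32_to_int16_clipped(aud)
print(pcm.dtype, pcm.min(), pcm.max())
```

Clipping changes only the few samples that overshoot. Whether that is preferable to simply removing the assertion depends on how far out of range the files are; for peaks around 1.005, as in this thread, the difference should be negligible.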