paper-2015-esc-convnet
Input dimensions of audio clips shorter than 4 seconds in URBAN SOUND DATASET
I am working with URBAN SOUND DATASET and found your paper. In load_urban_sound,
just before rows_audio = [np.vstack(rows_audio)],
I get the error:
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 88200 and the array at index 67 has size 176400
Indeed, some of the arrays have size 88200 along dimension 1 and others have size 176400.
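One possible workaround would be to force every clip to the same length before stacking. Below is a minimal sketch, not the repository's own fix. It assumes each element of rows_audio is a 2-D array of shape (1, n_samples), which is what the error message suggests. The helper name pad_or_trim and the choice of zero padding are hypothetical, and the two dummy arrays only stand in for real clips.

```python
import numpy as np

def pad_or_trim(clip, target_len):
    """Force the sample axis (axis 1) of a clip to exactly target_len samples.

    Hypothetical helper: shorter clips are zero-padded at the end,
    longer clips are truncated.
    """
    clip = np.atleast_2d(clip)
    n_samples = clip.shape[1]
    if n_samples >= target_len:
        return clip[:, :target_len]
    # Pad only at the end of axis 1; leave axis 0 untouched.
    return np.pad(clip, ((0, 0), (0, target_len - n_samples)), mode="constant")

# Dummy stand-ins for two clips with the sizes reported in the error.
rows_audio = [np.ones((1, 88200)), np.ones((1, 176400))]

# Assumption: use the longest clip as the common length.
target_len = max(row.shape[1] for row in rows_audio)
stacked = np.vstack([pad_or_trim(row, target_len) for row in rows_audio])
print(stacked.shape)
```

Zero padding is only one option. Repeating the short clip or dropping clips below a minimum length might suit the model better, depending on what the paper's pipeline expects.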