panns_transfer_to_gtzan

Problem with audio tagging when using the inference code

Open aliakbartaghizadeh opened this issue 4 years ago • 2 comments

Hello, and thanks for sharing the code; I appreciate your consideration. The README for this repo does not say how to run inference on a new file and tag it with the transfer_cnn14 model, so I decided to follow the approach described in the audioset_tagging_cnn-master README. However, I get the error shown in the screenshot below. How can I solve this problem? Do you have any dedicated inference code for transfer learning? If so, could you please upload it?

[Screenshot from 2020-12-04 16-45-22]

aliakbartaghizadeh, Dec 04 '20 13:12

I also encountered the same problem. Have you solved it?

AntyRia, Oct 10 '23 01:10

I solved this with the following steps:

  1. Copy inference.py from the other repo (audioset_tagging_cnn) into panns_transfer_to_gtzan.
  2. Use the model type Transfer_Cnn14.
  3. Add freeze_base=True when declaring the model.
  4. Run inference.py.

I found that the printed probabilities are negative. Also, the model in this repo does not produce a framewise_output, so it does not work with the sound_event_detection feature. I think this Kaggle notebook may offer a hint, though.

Example for blues.00005.wav:

GPU number: 1
blues: -0.154
rock: -2.781
country: -3.707
hiphop: -4.217
jazz: -4.519
reggae: -4.669
metal: -4.958
pop: -5.078
disco: -5.450
classical: -5.516
embedding: (2048,)
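
The negative values above look like log-probabilities rather than probabilities: exponentiating them gives numbers that sum to roughly 1. The following is a minimal sketch of that check, assuming the scores are natural-log softmax outputs (an assumption; this thread does not confirm how the model computes them). The `scores` dictionary is simply the printed output copied by hand.

```python
import math

# Scores copied from the example output for blues.00005.wav.
scores = {
    "blues": -0.154,
    "rock": -2.781,
    "country": -3.707,
    "hiphop": -4.217,
    "jazz": -4.519,
    "reggae": -4.669,
    "metal": -4.958,
    "pop": -5.078,
    "disco": -5.450,
    "classical": -5.516,
}

# Assumption: each score is a natural-log probability, so exp() recovers
# the probability itself.
probs = {label: math.exp(score) for label, score in scores.items()}
total = sum(probs.values())

# Print the classes from most to least likely, then the sum as a sanity check.
for label, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {p:.3f}")
print(f"sum: {total:.3f}")
```

If the sum comes out close to 1, as it does for these values, the log-probability reading is plausible, and "blues" at about 0.86 is the predicted genre.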

baicaigithub, Dec 19 '23 05:12