PerceptualAudio_Pytorch Issue about the dataset

Hi Adrien, thank you for sharing such good work. How and where can I get the datasets used by this code?

Oct 14 '21 11:10 spacecowboy11111

Hi, the paper is by Pranay Manocha, and you can access the data from his official repo: https://github.com/pranaymanocha/PerceptualAudio/tree/master/dataset

Oct 14 '21 12:10 adrienchaton

Hi Adrien, thank you for replying. I have downloaded the data_saved.npy file, but I see that your code expects four subsets, so I don't know how to handle data_saved.npy. Should I divide it into four subsets? I also don't know how to use the four txt files. Could you please tell me what I should do if I want to train the model? Thank you!


Oct 14 '21 12:10 spacecowboy11111

It's been a while since I downloaded the dataset, so I'm not sure which link I used; there are different ones on the official repo:
http://percepaudio.cs.princeton.edu/icassp2020_perceptual/audio_perception.zip
http://percepaudio.cs.princeton.edu/icassp2020_perceptual/data_saved.npy
I think I started from the .zip.

In any case, it is not essential to split the different perturbations into subsets = ['dataset_combined','dataset_eq','dataset_linear','dataset_reverb']

You can override the line at https://github.com/adrienchaton/PerceptualAudio_Pytorch/blob/master/train.py#L82 and make a single dataset with subsets = ['dataset']

Then, in https://github.com/adrienchaton/PerceptualAudio_Pytorch/blob/master/utils.py#L40, you should load an npy dictionary named data_path+'dataset_data.npy'

In this dictionary, put the files you want to train on as separate entries of the form [first signal, second signal, human label], where:

first signal should be the reference signal
second signal should be the perturbed signal
human label should be the rating of whether the sounds are judged dissimilar (1) or not (0)
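
As a rough illustration of the steps above, here is a minimal sketch of building and saving such a dictionary with NumPy. The helper names (build_dataset_dict, save_dataset, load_dataset), the integer keys, and the float32 arrays are assumptions made for illustration; they are not taken from the repo, so check utils.py for the key and value layout it actually expects.

```python
import os

import numpy as np


def build_dataset_dict(triplets):
    """Build a dict of [reference, perturbed, label] entries.

    `triplets` is an iterable of (reference, perturbed, label) tuples.
    Integer keys are an assumption; the repo may index entries differently.
    """
    data = {}
    for i, (ref, pert, label) in enumerate(triplets):
        if label not in (0, 1):
            raise ValueError("label must be 1 (dissimilar) or 0 (not dissimilar)")
        data[i] = [
            np.asarray(ref, dtype=np.float32),   # first signal: reference
            np.asarray(pert, dtype=np.float32),  # second signal: perturbed
            int(label),                          # human label
        ]
    return data


def save_dataset(data, data_path, subset="dataset"):
    """Save the dict as <data_path>/<subset>_data.npy and return the file path."""
    out_file = os.path.join(data_path, subset + "_data.npy")
    # A dict is stored as a pickled 0-d object array.
    np.save(out_file, data, allow_pickle=True)
    return out_file


def load_dataset(path):
    """Load the dict back; .item() unwraps the 0-d object array."""
    return np.load(path, allow_pickle=True).item()
```

With subsets = ['dataset'], calling save_dataset(data, data_path) writes a file named dataset_data.npy inside data_path, which matches the file name mentioned above.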

I don't work on this repo anymore, but this is what I remember from quickly looking back into it, so I would recommend going through the code yourself and checking that everything behaves as you expect.

Oct 14 '21 13:10 adrienchaton

Thank you very much!
