ZeroSpeech
KeyError when preprocessing data
I set the data directory to datasets/2019/english. When I run the script preprocess.py, it raises
keyerror: 'accessing unknown key in a struct: dataset.in_dir'
and I can't work out how to fix it.
Could you help me?
Hi @liu-x-p,
Sure. If you look at the usage in the readme it says:
python preprocess.py in_dir=/path/to/dataset dataset=[2019/english or 2019/surprise]
Note: in_dir must be the path to the 2019 folder...
This is the folder that contains the wav files in its subdirectories. So, for example, if I download the ZeroSpeech 2020 dataset and store it at ~/Documents/ZeroSpeech/2020, the command should be:
python preprocess.py in_dir=~/Documents/ZeroSpeech/2020/2019 dataset=2019/english
If you're still having trouble, please post the command you used and the path to your data directory.
Hope that helps!
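A quick way to sanity-check the path before running preprocess.py is a small script like the one below. This is only a sketch and not part of the repo: check_in_dir is a hypothetical helper, and it assumes the language folder (english or surprise) sits directly under the 2019 folder with the wav files somewhere beneath it.

```python
from pathlib import Path


def check_in_dir(in_dir, dataset="2019/english"):
    """Return a list of problems found with in_dir (empty list if none).

    Hypothetical helper, not part of the ZeroSpeech repo. Assumes the
    layout <in_dir>/<language>/.../*.wav, where in_dir is the 2019 folder.
    """
    root = Path(in_dir).expanduser()
    year, language = dataset.split("/")

    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    # in_dir should be the 2019 folder itself, not its parent or a child.
    if root.name != year:
        problems.append(f"in_dir should end in '{year}', got '{root.name}'")

    # The language folder is assumed to sit directly under in_dir.
    lang_dir = root / language
    if not lang_dir.is_dir():
        problems.append(f"missing subfolder: {lang_dir}")
    elif next(lang_dir.rglob("*.wav"), None) is None:
        problems.append(f"no .wav files found under {lang_dir}")

    return problems


if __name__ == "__main__":
    for problem in check_in_dir("~/Documents/ZeroSpeech/2020/2019"):
        print(problem)
```

If it prints nothing, the folder at least has the shape described above; it does not check anything about the json files or the Hydra config.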
@bshall Thank you!
I followed your instructions and ran the command
python preprocess.py in_dir=/home/omnisky/mount/holiday/ZeroSpeech-0.1/datasets/2020/2019 dataset=2019/english
The path is /home/omnisky/mount/holiday/ZeroSpeech-0.1/datasets/2020/2019, and it contains 'english' and 'surprise'.
No problem @liu-x-p. If you're still having issues, I'd advise keeping the actual data in a folder separate from this repo. For example, this repo would be under holiday/ZeroSpeech and the actual wav files would be stored in holiday/RawData/2020. Then in_dir should point to .../holiday/RawData/2020/2019.
Following the exact same procedure, I get an error: hydra.errors.OverrideParseException: LexerNoViableAltException: Passport/VAE/ZeroSpeech/zerospeech_2020/2020/2019. Could you kindly help me out? The directory path to the wav files is Passport/VAE/ZeroSpeech/zerospeech_2020/2020/2019 and the path to the json files is Passport/VAE/ZeroSpeech/zerospeech_2020/datasets/2019/english
@liu-x-p Hi! I am also a Chinese student trying to run this repo, and I am running into similar problems to yours...TAT I wonder whether you managed to run this repo successfully, and whether we could discuss it via e-mail... This is my email address: [email protected] Looking forward to your reply!
@ZhengRachel
I'm not sure about this, as it was a long time ago.
As you can see in my question and comment, I hit this problem when I downloaded this project as ZeroSpeech-0.1, which I think may be an early version. I then downloaded it again from the master branch (ZeroSpeech-master), and it worked.
I think the command I used was python preprocess.py in_dir=../datasets/2020/2019 dataset=2019/english