deepspeech.pytorch.ko
usage
Thank you for the great work.
I followed the speech.ko repo for preprocessing, and the deepspeech.pytorch repo for its preprocess and preparemetafile steps. Can this repo be run independently, using only the Korean open data *.zip files (the same data as speech.ko)?
When I run this repo with just the raw zipped datasets, running data/nikl.py fails with path problems (files not found) involving $home/corpus or .txt paths. I would like to know the directory structure this repo expects.
Given already-prepared datasets, where should I pick up in this repo, and which steps can I skip? Also, do I have to implement the original deepspeech.pytorch repo with a Korean frontend that has Korean-cleaners? Thanks!!
I did not make a pull request because this was a personal project. For the directory structure, the following will help you out.
https://github.com/homink/deepspeech.pytorch.ko/blob/c09b17925472551518c590fd0ac954f9d706728b/data/nikl.py#L12
Does that nikl_dataset mean the raw downloaded zip files, or the output of preprocess.py in deepvoice.pytorch?
I am having difficulty working out what to put under the $HOME/copora/NIKL directory. Thanks!
subprocess.call(["local/clean_corpus.sh","$HOME/copora/NIKL",args.target_dir])
subprocess.call(["local/data_prep.sh","$HOME/copora/NIKL",args.target_dir])
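One possible cause, offered as a guess and not a confirmed diagnosis: when `subprocess.call` is given a list and `shell=False` (the default), no shell is involved, so the string "$HOME/copora/NIKL" reaches the script literally, with `$HOME` unexpanded. Unless the scripts expand it themselves, the path will not be found. The sketch below expands the variable in Python before the call. The helper names `expand_corpus_path` and `run_prep`, and the `runner` parameter, are hypothetical and not part of the repo.

```python
import os
import subprocess


def expand_corpus_path(raw_path):
    """Expand $VARS and a leading ~ and return an absolute path."""
    return os.path.abspath(os.path.expanduser(os.path.expandvars(raw_path)))


def run_prep(target_dir, corpus="$HOME/copora/NIKL", runner=subprocess.call):
    """Hypothetical wrapper around the two calls quoted above.

    `runner` defaults to subprocess.call; it is a parameter only so the
    function can be exercised without the shell scripts being present.
    """
    corpus_dir = expand_corpus_path(corpus)
    if not os.path.isdir(corpus_dir):
        # Fail early with the resolved path so the missing directory is obvious.
        raise FileNotFoundError("corpus directory not found: " + corpus_dir)
    for script in ("local/clean_corpus.sh", "local/data_prep.sh"):
        runner([script, corpus_dir, target_dir])
    return corpus_dir
```

With this, the scripts receive an absolute path such as /home/<user>/copora/NIKL, and a missing corpus directory is reported before any script runs.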
What is the directory path from which you run the command?
speech.ko
  └ zip files
  └ trimmed data
  └ metadata.txt
  └ f101
  └ f102

deepvoice.pytorch
  └ data
  └ nikl.m
  └ multimel.npy files
  └ multispec.npy files

In the existing repos above, I have already completed the steps that create trimmed_data and the npy files.

deepspeechtorch.ko (current repo)
  └ data
  └ local
  └ raw zip files

I put the originally downloaded 30대여성_.zip (women in their 30s) files back here. In clean_corpus.sh under local, I uncommented the unzip part and ran it. After unzip, the inflating, mv, and other steps proceed like the preprocessing in speech.ko, but then the script cannot find $HOME/copora/NIKL and reports that metadata.txt and the other files underneath it are missing.

#NIKL corpus consists of several zip files.
#You can organize folders into your corpus directory with the following commands
unzip '*.zip'
mv -f "3-3(50female)"/* ./
mv -f "3-3(50male)"/* ./
rm -rf "3-3(50female)" "3-3(50male)"
#You can delete corpus with the following command and unzip again if necessary.
rm -rf Bad* Non* f* m* *.txt *.hwp script speak
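For anyone who prefers to do the folder organization from Python, here is a minimal sketch of the `mv -f` and `rm -rf` steps quoted from clean_corpus.sh. It assumes the zip files have already been extracted into the corpus directory. The function name `flatten_subdirs` is hypothetical. Unlike `mv -f`, `shutil.move` may raise an error instead of overwriting when an entry of the same name already exists at the destination.

```python
import os
import shutil


def flatten_subdirs(corpus_dir, subdirs=("3-3(50female)", "3-3(50male)")):
    """Move the contents of each listed subfolder up into corpus_dir,
    then remove the emptied subfolder. Returns the names that were moved."""
    moved = []
    for name in subdirs:
        sub = os.path.join(corpus_dir, name)
        if not os.path.isdir(sub):
            continue  # subfolder absent, e.g. that zip file was not extracted
        for entry in sorted(os.listdir(sub)):
            shutil.move(os.path.join(sub, entry), os.path.join(corpus_dir, entry))
            moved.append(entry)
        shutil.rmtree(sub)
    return moved
```

Calling `flatten_subdirs(os.path.expandvars("$HOME/copora/NIKL"))` would then leave the speaker folders and text files directly under the corpus directory, which appears to be the layout the shell comments describe.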