kaldiio
Reading a sample wav.scp file (includes a sox pipe)
Hi all, I want to use the kaldiio library to read a wav.scp file and a segments file, but the wav.scp file contains pipe commands like the following:
ui23faz_0101 /usr/bin/sox /path/ui23faz_0102/ui23faz_0102.wav -r 16000 -c 1 -b 16 -t wav - downsample |
The kaldiio reader does not work on it. Does kaldiio not support this kind of wav.scp?
Thank you for using our tool. Could you show me the error log?
Thank you for your reply,
this is my error log:
Colocations handled automatically by placer.
2019-05-08 19:51:53.221278: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2019-05-08 19:51:53.225237: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3000125000 Hz
2019-05-08 19:51:53.225357: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5555599691e0 executing computations on platform Host. Devices:
2019-05-08 19:51:53.225375: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0):
Maybe there is a problem with your wav file. kaldiio just uses scipy to load wav files, so you can check it as follows:
/usr/bin/sox /path/ui23faz_0102/ui23faz_0102.wav -r 16000 -c 1 -b 16 -t wav - downsample > out.wav
python
>>> import scipy.io.wavfile
>>> scipy.io.wavfile.read('out.wav')
Thanks for your reply. I tested with your method, and my wav file has no problem. The test results are as follows:
/usr/bin/sox /home4/md510/w2018/original_seame/wavdata/interview/ui23faz_0101/ui23faz_0101.wav -r 16000 -c 1 -b 16 -t wav - downsample > out.wav
/usr/bin/sox WARN rate: rate clipped 17 samples; decrease volume?
/usr/bin/sox WARN dither: dither clipped 17 samples; decrease volume?
python3
>>> import scipy.io.wavfile
>>> scipy.io.wavfile.read('out.wav')
(16000, array([ -1, 1, -1, ..., -17, -5, 4], dtype=int16))
Your wave file has incorrect file size information in its header, and scipy.io.wavfile doesn't support such wave files:
/usr/bin/sox WARN wav: Premature EOF on .wav input file
I have switched to the wave module in the new kaldiio release. Try pip install -U kaldiio.
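For reference, here is a minimal sketch of the general idea of reading a piped wav.scp entry with subprocess and the standard wave module. It is not kaldiio's actual code: read_piped_wav is a hypothetical helper, and it assumes the command writes 16-bit PCM wav data to stdout and that a POSIX shell is available.

```python
import io
import struct
import subprocess
import wave


def read_piped_wav(scp_line):
    """Hypothetical helper: load the audio of one 'utt_id command |' entry."""
    utt_id, rest = scp_line.strip().split(None, 1)
    if not rest.endswith("|"):
        raise ValueError("expected a piped entry ending with '|'")
    command = rest[:-1].strip()

    # Run the command through the shell and capture the wav bytes from stdout.
    proc = subprocess.run(command, shell=True, stdout=subprocess.PIPE, check=True)

    # Parse the captured bytes with the standard wave module.
    with wave.open(io.BytesIO(proc.stdout), "rb") as wf:
        rate = wf.getframerate()
        width = wf.getsampwidth()
        frames = wf.readframes(wf.getnframes())
    if width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    # Decode little-endian 16-bit samples (interleaved if multi-channel).
    n_samples = len(frames) // 2
    samples = list(struct.unpack("<%dh" % n_samples, frames[: n_samples * 2]))
    return utt_id, rate, samples
```

Calling read_piped_wav on a line such as the one in the first post would return the utterance id, the sampling rate, and the samples as a list of integers.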
Thank you, I upgraded the kaldiio library as you suggested. I also have a performance question. I generate mel-fbank features and write them in Kaldi's ark and scp file format. On a small 6-hour data set with 10 processes, this took one hour and four minutes. But when I switched to a larger data set (96 hours) with 32 processes, the program had still not finished after 30 hours. Does kaldiio's read/write performance degrade over time?
Maybe. Simple reading without a segments file should not be much slower than Kaldi, because it just uses subprocess to invoke the commands and scipy/python-wave to load the result, but I haven't optimized it for segments.
Could you tell me more about your case: how long is each wave file, and how long are the segments within the wave files? If you can, attaching the scp file and the segments file would help me.
Thank you for your reply. This 96-hour data set works well in Kaldi, but when I used kaldiio's matrix read/write interface, the job ran for three days without finishing feature extraction. As you requested, here is a description of my data set: each wave file is about 1-2 hours long, and each segment is about 2-7 seconds long.
I created a test set that closely matches your corpus, but in my environment it doesn't take such a long time. It ran at the same speed as Kaldi itself.
I noticed that your log included TensorFlow messages. Are you trying to extract the features from the wav files inside the training script?
In general, invoking a subprocess takes much longer if a large amount of memory is allocated.
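For example:
import subprocess
import time

import numpy

# Baseline: spawn a subprocess before any large allocation.
t = time.time()
subprocess.run('echo hello', shell=True)
print(f'{time.time() - t} [s]')

# Allocate a large array (100 million float64 values, about 800 MB).
x = numpy.ones((100000000,))

# The same call takes much more time now.
t = time.time()
subprocess.run('echo hello', shell=True)
print(f'{time.time() - t} [s]')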
This is not the fault of Python's subprocess module; it is a property of the fork() system call.
So if you invoke sox via wav.scp, you need to take care to allocate as little extra memory as possible.
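One possible way to follow this advice, sketched under assumptions: run the piped commands once up front, before the process allocates large arrays, and use a new scp that points at the resulting plain files. materialize_piped_scp and the output directory layout are hypothetical, not part of kaldiio, and a POSIX shell is assumed.

```python
import os
import subprocess


def materialize_piped_scp(scp_lines, out_dir):
    """Hypothetical helper: run each piped entry once, return plain-file scp lines."""
    os.makedirs(out_dir, exist_ok=True)
    new_lines = []
    for line in scp_lines:
        utt_id, rest = line.strip().split(None, 1)
        if not rest.endswith("|"):
            # Already a plain file path: keep the entry unchanged.
            new_lines.append(f"{utt_id} {rest}")
            continue
        out_path = os.path.join(out_dir, utt_id + ".wav")
        # Redirect the command's stdout straight into the output file.
        with open(out_path, "wb") as f:
            subprocess.run(rest[:-1].strip(), shell=True, stdout=f, check=True)
        new_lines.append(f"{utt_id} {out_path}")
    return new_lines
```

The returned lines can be written to a new wav.scp, so later reads need no subprocess at all. The trade-off is extra disk space for the converted files.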
Thanks for your reply. I'm going to check the other parts of my code.