kaldi-gstreamer-server
Error when recording from microphone
I am trying to record from the microphone using this command:
arecord -f S16_LE -r 16000 | python kaldigstserver/client.py -r 32000 -
However, I get this error in the client output:
Recording WAVE 'stdin' : Signed 16 bit Little Endian, Rate 16000 Hz, Mono
Received error from server (status 1)
Output from the server:
DEBUG 2019-04-04 12:13:58,638 Starting up server
INFO 2019-04-04 12:14:12,266 101 GET /worker/ws/speech (127.0.0.1) 0.49ms
INFO 2019-04-04 12:14:12,266 New worker available <main.WorkerSocketHandler object at 0x7fb7b10c90d0>
INFO 2019-04-04 12:14:31,184 101 GET /client/ws/speech?content-type= (127.0.0.1) 0.39ms
INFO 2019-04-04 12:14:31,185 bd36902b-a033-4e0e-8c29-1c7a4b6ec7f9: OPEN
INFO 2019-04-04 12:14:31,185 bd36902b-a033-4e0e-8c29-1c7a4b6ec7f9: Request arguments: content-type=""
INFO 2019-04-04 12:14:31,185 bd36902b-a033-4e0e-8c29-1c7a4b6ec7f9: Using worker <main.DecoderSocketHandler object at 0x7fb7b10c9a90>
INFO 2019-04-04 12:14:46,248 bd36902b-a033-4e0e-8c29-1c7a4b6ec7f9: Sending event {u'status': 1, 'id': 'bd36902b-a033-4e0e-8c29-1c7a4b6ec7f9'} to client
INFO 2019-04-04 12:14:46,249 Worker <main.WorkerSocketHandler object at 0x7fb7b10c90d0> leaving
INFO 2019-04-04 12:14:46,249 bd36902b-a033-4e0e-8c29-1c7a4b6ec7f9: Handling on_connection_close()
INFO 2019-04-04 12:14:46,249 bd36902b-a033-4e0e-8c29-1c7a4b6ec7f9: Closing worker connection
INFO 2019-04-04 12:14:47,253 101 GET /worker/ws/speech (127.0.0.1) 0.38ms
INFO 2019-04-04 12:14:47,253 New worker available <main.WorkerSocketHandler object at 0x7fb7b10c9510>
Output from the worker:
DEBUG 2019-04-04 12:14:10,074 Starting up worker
2019-04-04 12:14:10 - INFO: decoder2: Creating decoder using conf: {'post-processor': "perl -npe 'BEGIN {use IO::Handle; STDOUT->autoflush(1);} s/(.*)/\1./;'", 'logging': {'version': 1, 'root': {'level': 'DEBUG', 'handlers': ['console']}, 'formatters': {'simpleFormater': {'datefmt': '%Y-%m-%d %H:%M:%S', 'format': '%(asctime)s - %(levelname)7s: %(name)10s: %(message)s'}}, 'disable_existing_loggers': False, 'handlers': {'console': {'formatter': 'simpleFormater', 'class': 'logging.StreamHandler', 'level': 'DEBUG'}}}, 'use-vad': False, 'decoder': {'ivector-extraction-config': 'de_400k_nnet3chain_tdnn1f_2048_sp_bi/ivector_extractor/ivector_extractor.conf', 'lattice-beam': 5.0, 'acoustic-scale': 1.0, 'do-endpointing': True, 'beam': 5.0, 'mfcc-config': 'de_400k_nnet3chain_tdnn1f_2048_sp_bi/conf/mfcc_hires.conf', 'traceback-period-in-secs': 0.25, 'nnet-mode': 3, 'endpoint-silence-phones': '1:2:3:4:5:6', 'word-syms': 'de_400k_nnet3chain_tdnn1f_2048_sp_bi/words.txt', 'num-nbest': 10, 'frame-subsampling-factor': 3, 'phone-syms': 'de_400k_nnet3chain_tdnn1f_2048_sp_bi/phones.txt', 'max-active': 10000, 'fst': 'de_400k_nnet3chain_tdnn1f_2048_sp_bi/HCLG.fst', 'use-threaded-decoder': True, 'model': 'de_400k_nnet3chain_tdnn1f_2048_sp_bi/final.mdl', 'chunk-length-in-secs': 0.25}, 'silence-timeout': 15, 'out-dir': 'tmp', 'use-nnet2': True}
2019-04-04 12:14:10 - INFO: decoder2: Setting decoder property: nnet-mode = 3
2019-04-04 12:14:10 - INFO: decoder2: Setting decoder property: ivector-extraction-config = de_400k_nnet3chain_tdnn1f_2048_sp_bi/ivector_extractor/ivector_extractor.conf
2019-04-04 12:14:10 - INFO: decoder2: Setting decoder property: lattice-beam = 5.0
2019-04-04 12:14:10 - INFO: decoder2: Setting decoder property: acoustic-scale = 1.0
2019-04-04 12:14:10 - INFO: decoder2: Setting decoder property: do-endpointing = True
2019-04-04 12:14:10 - INFO: decoder2: Setting decoder property: beam = 5.0
2019-04-04 12:14:10 - INFO: decoder2: Setting decoder property: mfcc-config = de_400k_nnet3chain_tdnn1f_2048_sp_bi/conf/mfcc_hires.conf
2019-04-04 12:14:10 - INFO: decoder2: Setting decoder property: traceback-period-in-secs = 0.25
2019-04-04 12:14:10 - INFO: decoder2: Setting decoder property: endpoint-silence-phones = 1:2:3:4:5:6
2019-04-04 12:14:10 - INFO: decoder2: Setting decoder property: word-syms = de_400k_nnet3chain_tdnn1f_2048_sp_bi/words.txt
2019-04-04 12:14:11 - INFO: decoder2: Setting decoder property: num-nbest = 10
2019-04-04 12:14:11 - INFO: decoder2: Setting decoder property: frame-subsampling-factor = 3
2019-04-04 12:14:11 - INFO: decoder2: Setting decoder property: phone-syms = de_400k_nnet3chain_tdnn1f_2048_sp_bi/phones.txt
2019-04-04 12:14:11 - INFO: decoder2: Setting decoder property: max-active = 10000
2019-04-04 12:14:11 - INFO: decoder2: Setting decoder property: chunk-length-in-secs = 0.25
2019-04-04 12:14:11 - INFO: decoder2: Setting decoder property: fst = de_400k_nnet3chain_tdnn1f_2048_sp_bi/HCLG.fst
2019-04-04 12:14:11 - INFO: decoder2: Setting decoder property: model = de_400k_nnet3chain_tdnn1f_2048_sp_bi/final.mdl
LOG ([5.5.266~1-77ac7]:CompileLooped():nnet-compile-looped.cc:345) Spent 0.0172811 seconds in looped compilation.
2019-04-04 12:14:12 - INFO: decoder2: Created GStreamer elements
2019-04-04 12:14:12 - DEBUG: decoder2: Adding <gi.GstAppSrc object at 0x7f723b1385f0 (GstAppSrc at 0x5614e3fa81d0)> to the pipeline
2019-04-04 12:14:12 - DEBUG: decoder2: Adding <gi.GstDecodeBin object at 0x7f723b1385a0 (GstDecodeBin at 0x5614e3fb20e0)> to the pipeline
2019-04-04 12:14:12 - DEBUG: decoder2: Adding <gi.GstAudioConvert object at 0x7f723b138690 (GstAudioConvert at 0x5614e3fdbb10)> to the pipeline
2019-04-04 12:14:12 - DEBUG: decoder2: Adding <gi.GstAudioResample object at 0x7f723b138550 (GstAudioResample at 0x5614e3fdf8a0)> to the pipeline
2019-04-04 12:14:12 - DEBUG: decoder2: Adding <gi.GstTee object at 0x7f723b138640 (GstTee at 0x5614e3fe2000)> to the pipeline
2019-04-04 12:14:12 - DEBUG: decoder2: Adding <gi.GstQueue object at 0x7f723b138730 (GstQueue at 0x5614e3fe60d0)> to the pipeline
2019-04-04 12:14:12 - DEBUG: decoder2: Adding <gi.GstFileSink object at 0x7f723b138780 (GstFileSink at 0x5614e3fec1e0)> to the pipeline
2019-04-04 12:14:12 - DEBUG: decoder2: Adding <gi.GstQueue object at 0x7f723b1387d0 (GstQueue at 0x5614e3fe63d0)> to the pipeline
2019-04-04 12:14:12 - DEBUG: decoder2: Adding <gi.Gstkaldinnet2onlinedecoder object at 0x7f723b138820 (Gstkaldinnet2onlinedecoder at 0x5614e3fee140)> to the pipeline
2019-04-04 12:14:12 - DEBUG: decoder2: Adding <gi.GstFakeSink object at 0x7f723b138870 (GstFakeSink at 0x5614e4028bc0)> to the pipeline
2019-04-04 12:14:12 - INFO: decoder2: Linking GStreamer elements
LOG ([5.5.266~1-77ac7]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG ([5.5.266~1-77ac7]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
2019-04-04 12:14:12 - INFO: decoder2: Setting pipeline to READY
2019-04-04 12:14:12 - INFO: decoder2: Set pipeline to READY
2019-04-04 12:14:12 - INFO: main: Opening websocket connection to master server
2019-04-04 12:14:12 - INFO: main: Opened websocket connection to server
2019-04-04 12:14:31 - DEBUG: main:
Try running this (stop recording with Ctrl+C):
arecord -f S16_LE -r 16000 > speech.wav
Then play it back with a player you know works, for instance VLC:
vlc speech.wav
- Does it play the audio you recorded or is it just noise or silence?
- If it does have actual recorded audio, does it contain speech? (notice the worker complains about the absence of speech in the error you posted)
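If listening is inconclusive, the same check can be done numerically. The following is a minimal Python 3 sketch, not part of kaldi-gstreamer-server: the helper name `wav_levels` is hypothetical, and it assumes the file is 16-bit PCM as produced by the `arecord -f S16_LE` command above. A peak and RMS close to zero suggest the microphone captured silence.

```python
import array
import math
import sys
import wave


def wav_levels(path):
    """Return (duration_seconds, peak, rms) for a 16-bit PCM WAV file.

    Hypothetical helper for diagnosing a silent recording.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples (arecord -f S16_LE)")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())

    # Drop a trailing odd byte in case the recording was cut off mid-sample.
    raw = raw[: len(raw) - len(raw) % 2]

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    if not samples:
        return 0.0, 0, 0.0

    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    duration = len(samples) / (channels * rate)
    return duration, peak, rms
```

For example, `print(wav_levels("speech.wav"))` prints the duration, peak, and RMS of the file recorded above. What counts as "too quiet" depends on the microphone and gain settings, so treat the numbers as a rough indication only.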
Here I have complete, working code for decoding from the microphone; Python 2 and 3 are supported. Hope this helps someone.
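As a rough illustration of the pacing involved when streaming microphone audio, here is a short Python sketch. It is not the code referred to above and not taken from `client.py`. It assumes that the `-r 32000` in the original command is a byte rate (16000 samples/s times 2 bytes per S16_LE sample, mono) and that audio is sent in quarter-second blocks; the names `byte_rate` and `iter_blocks` are hypothetical.

```python
def byte_rate(sample_rate, sample_width_bytes=2, channels=1):
    """Bytes per second of raw PCM audio (assumed meaning of the -r option)."""
    return sample_rate * sample_width_bytes * channels


def iter_blocks(stream, rate_bytes_per_sec, blocks_per_sec=4):
    """Yield fixed-size blocks from a binary stream until it is exhausted.

    A real client would send each block over the websocket and sleep
    1.0 / blocks_per_sec between blocks to simulate real-time input;
    both steps are omitted in this sketch.
    """
    block_size = rate_bytes_per_sec // blocks_per_sec
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield block
```

With a 16 kHz, 16-bit, mono recording, `byte_rate(16000)` gives 32000, so each block holds 8000 bytes, or a quarter second of audio. Passing `sys.stdin.buffer` as `stream` would read from a pipe such as the `arecord` command in the question.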