self-supervised-speech-recognition Error in Making prediction

Hi all

After following the install instructions and downloading your pre-trained models, I executed this code in Colab:

from stt import Transcriber
transcriber = Transcriber(pretrain_model = '/content/vietnamese_wav2vec/pretrain.pt', finetune_model = '/content/vietnamese_wav2vec/finetune.pt', 
                          dictionary = '/content/vietnamese_wav2vec/dict.ltr.txt',
                          lm_type = 'kenlm',
                          lm_lexicon = '/content/vietnamese_wav2vec/lexicon.txt', lm_model = '/content/vietnamese_wav2vec/lm.bin',
                          lm_weight = 1.5, word_score = -1, beam_size = 50)
hypos = transcriber.transcribe(['/content/1000.wav'])
print(hypos)

But it gives me this error:

usage: ipykernel_launcher.py [-h] [--no-progress-bar]
                             [--log-interval LOG_INTERVAL]
                             [--log-format {json,none,simple,tqdm}]
                             [--tensorboard-logdir TENSORBOARD_LOGDIR]
                             [--wandb-project WANDB_PROJECT]
                             [--azureml-logging] [--seed SEED] [--cpu] [--tpu]
                             [--bf16] [--memory-efficient-bf16] [--fp16]
                             [--memory-efficient-fp16]
                             [--fp16-no-flatten-grads]
                             [--fp16-init-scale FP16_INIT_SCALE]
                             [--fp16-scale-window FP16_SCALE_WINDOW]
                             [--fp16-scale-tolerance FP16_SCALE_TOLERANCE]
                             [--min-loss-scale MIN_LOSS_SCALE]
                             [--threshold-loss-scale THRESHOLD_LOSS_SCALE]
                             [--user-dir USER_DIR]
                             [--empty-cache-freq EMPTY_CACHE_FREQ]
                             [--all-gather-list-size ALL_GATHER_LIST_SIZE]
                             [--model-parallel-size MODEL_PARALLEL_SIZE]
                             [--quantization-config-path QUANTIZATION_CONFIG_PATH]
                             [--profile] [--reset-logging]
                             [--criterion {cross_entropy,label_smoothed_cross_entropy,composite_loss,sentence_ranking,label_smoothed_cross_entropy_with_alignment,legacy_masked_lm_loss,wav2vec,masked_lm,ctc,sentence_prediction,nat_loss,model,adaptive_loss,vocab_parallel_cross_entropy}]
                             [--tokenizer {space,moses,nltk}]
                             [--bpe {bert,byte_bpe,gpt2,hf_byte_bpe,fastbpe,subword_nmt,sentencepiece,bytes,characters}]
                             [--optimizer {adamax,adadelta,nag,adagrad,adafactor,adam,sgd,composite,lamb}]
                             [--lr-scheduler {inverse_sqrt,triangular,cosine,fixed,tri_stage,reduce_lr_on_plateau,polynomial_decay,manual,pass_through}]
                             [--scoring {sacrebleu,bleu,chrf,wer}]
                             [--task TASK] [--num-workers NUM_WORKERS]
                             [--skip-invalid-size-inputs-valid-test]
                             [--max-tokens MAX_TOKENS]
                             [--batch-size BATCH_SIZE]
                             [--required-batch-size-multiple REQUIRED_BATCH_SIZE_MULTIPLE]
                             [--required-seq-len-multiple REQUIRED_SEQ_LEN_MULTIPLE]
                             [--dataset-impl {raw,lazy,cached,mmap,fasta}]
                             [--data-buffer-size DATA_BUFFER_SIZE]
                             [--train-subset TRAIN_SUBSET]
                             [--valid-subset VALID_SUBSET]
                             [--validate-interval VALIDATE_INTERVAL]
                             [--validate-interval-updates VALIDATE_INTERVAL_UPDATES]
                             [--validate-after-updates VALIDATE_AFTER_UPDATES]
                             [--fixed-validation-seed FIXED_VALIDATION_SEED]
                             [--disable-validation]
                             [--max-tokens-valid MAX_TOKENS_VALID]
                             [--batch-size-valid BATCH_SIZE_VALID]
                             [--curriculum CURRICULUM]
                             [--gen-subset GEN_SUBSET]
                             [--num-shards NUM_SHARDS] [--shard-id SHARD_ID]
                             [--distributed-world-size DISTRIBUTED_WORLD_SIZE]
                             [--distributed-rank DISTRIBUTED_RANK]
                             [--distributed-backend DISTRIBUTED_BACKEND]
                             [--distributed-init-method DISTRIBUTED_INIT_METHOD]
                             [--distributed-port DISTRIBUTED_PORT]
                             [--device-id DEVICE_ID] [--distributed-no-spawn]
                             [--ddp-backend {c10d,no_c10d}]
                             [--bucket-cap-mb BUCKET_CAP_MB]
                             [--fix-batches-to-gpus]
                             [--find-unused-parameters] [--fast-stat-sync]
                             [--broadcast-buffers]
                             [--distributed-wrapper {DDP,SlowMo}]
                             [--slowmo-momentum SLOWMO_MOMENTUM]
                             [--slowmo-algorithm SLOWMO_ALGORITHM]
                             [--localsgd-frequency LOCALSGD_FREQUENCY]
                             [--nprocs-per-node NPROCS_PER_NODE]
                             [--pipeline-model-parallel]
                             [--pipeline-balance PIPELINE_BALANCE]
                             [--pipeline-devices PIPELINE_DEVICES]
                             [--pipeline-chunks PIPELINE_CHUNKS]
                             [--pipeline-encoder-balance PIPELINE_ENCODER_BALANCE]
                             [--pipeline-encoder-devices PIPELINE_ENCODER_DEVICES]
                             [--pipeline-decoder-balance PIPELINE_DECODER_BALANCE]
                             [--pipeline-decoder-devices PIPELINE_DECODER_DEVICES]
                             [--pipeline-checkpoint {always,never,except_last}]
                             [--zero-sharding {none,os}] [--path PATH]
                             [--post-process [POST_PROCESS]] [--quiet]
                             [--model-overrides MODEL_OVERRIDES]
                             [--results-path RESULTS_PATH] [--beam BEAM]
                             [--nbest NBEST] [--max-len-a MAX_LEN_A]
                             [--max-len-b MAX_LEN_B] [--min-len MIN_LEN]
                             [--match-source-len] [--unnormalized]
                             [--no-early-stop] [--no-beamable-mm]
                             [--lenpen LENPEN] [--unkpen UNKPEN]
                             [--replace-unk [REPLACE_UNK]] [--sacrebleu]
                             [--score-reference] [--prefix-size PREFIX_SIZE]
                             [--no-repeat-ngram-size NO_REPEAT_NGRAM_SIZE]
                             [--sampling] [--sampling-topk SAMPLING_TOPK]
                             [--sampling-topp SAMPLING_TOPP]
                             [--constraints [{ordered,unordered}]]
                             [--temperature TEMPERATURE]
                             [--diverse-beam-groups DIVERSE_BEAM_GROUPS]
                             [--diverse-beam-strength DIVERSE_BEAM_STRENGTH]
                             [--diversity-rate DIVERSITY_RATE]
                             [--print-alignment [{hard,soft}]] [--print-step]
                             [--lm-path LM_PATH] [--lm-weight LM_WEIGHT]
                             [--iter-decode-eos-penalty ITER_DECODE_EOS_PENALTY]
                             [--iter-decode-max-iter ITER_DECODE_MAX_ITER]
                             [--iter-decode-force-max-iter]
                             [--iter-decode-with-beam ITER_DECODE_WITH_BEAM]
                             [--iter-decode-with-external-reranker]
                             [--retain-iter-history] [--retain-dropout]
                             [--retain-dropout-modules RETAIN_DROPOUT_MODULES]
                             [--decoding-format {unigram,ensemble,vote,dp,bs}]
                             [--no-seed-provided] [--save-dir SAVE_DIR]
                             [--restore-file RESTORE_FILE]
                             [--finetune-from-model FINETUNE_FROM_MODEL]
                             [--reset-dataloader] [--reset-lr-scheduler]
                             [--reset-meters] [--reset-optimizer]
                             [--optimizer-overrides OPTIMIZER_OVERRIDES]
                             [--save-interval SAVE_INTERVAL]
                             [--save-interval-updates SAVE_INTERVAL_UPDATES]
                             [--keep-interval-updates KEEP_INTERVAL_UPDATES]
                             [--keep-last-epochs KEEP_LAST_EPOCHS]
                             [--keep-best-checkpoints KEEP_BEST_CHECKPOINTS]
                             [--no-save] [--no-epoch-checkpoints]
                             [--no-last-checkpoints]
                             [--no-save-optimizer-state]
                             [--best-checkpoint-metric BEST_CHECKPOINT_METRIC]
                             [--maximize-best-checkpoint-metric]
                             [--patience PATIENCE]
                             [--checkpoint-suffix CHECKPOINT_SUFFIX]
                             [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT]
                             [--load-checkpoint-on-all-dp-ranks]
                             [--kspmodel KSPMODEL] [--wfstlm WFSTLM]
                             [--rnnt_decoding_type RNNT_DECODING_TYPE]
                             [--rnnt_len_penalty RNNT_LEN_PENALTY]
                             [--w2l-decoder W2L_DECODER] [--lexicon LEXICON]
                             [--unit-lm] [--kenlm-model KENLM_MODEL]
                             [--beam-threshold BEAM_THRESHOLD]
                             [--beam-size-token BEAM_SIZE_TOKEN]
                             [--word-score WORD_SCORE]
                             [--unk-weight UNK_WEIGHT]
                             [--sil-weight SIL_WEIGHT]
                             [--dump-emissions DUMP_EMISSIONS]
                             [--dump-features DUMP_FEATURES]
                             [--load-emissions LOAD_EMISSIONS] [-s SRC]
                             [-t TARGET] [--load-alignments]
                             [--left-pad-source BOOL] [--left-pad-target BOOL]
                             [--max-source-positions N]
                             [--max-target-positions N]
                             [--upsample-primary UPSAMPLE_PRIMARY]
                             [--truncate-source] [--num-batch-buckets N]
                             [--eval-bleu] [--eval-bleu-detok EVAL_BLEU_DETOK]
                             [--eval-bleu-detok-args JSON]
                             [--eval-tokenized-bleu]
                             [--eval-bleu-remove-bpe [EVAL_BLEU_REMOVE_BPE]]
                             [--eval-bleu-args JSON]
                             [--eval-bleu-print-samples]
                             [--force-anneal FORCE_ANNEAL]
                             [--lr-shrink LR_SHRINK]
                             [--warmup-updates WARMUP_UPDATES] [--pad PAD]
                             [--eos EOS] [--unk UNK]
                             data
ipykernel_launcher.py: error: unrecognized arguments: -f /mnt/disks2/data /mnt/disks2/data

An exception has occurred, use %tb to see the full traceback.

SystemExit: 2

/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py:2890: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)

and %tb gives this:


---------------------------------------------------------------------------

SystemExit                                Traceback (most recent call last)

<ipython-input-46-24d37fe9d36e> in <module>()
      4                           lm_type = 'kenlm',
      5                           lm_lexicon = 'path/to/lm/lexicon.txt', lm_model = 'path/to/lm/lm.bin',
----> 6                           lm_weight = 1.5, word_score = -1, beam_size = 50)
      7 hypos = transcriber.transcribe(['path/to/wavs/0_1.wav','path/to/wavs/0_2.wav'])
      8 print(hypos)

4 frames

/usr/lib/python3.7/argparse.py in exit(self, status, message)
   2502         if message:
   2503             self._print_message(message, _sys.stderr)
-> 2504         _sys.exit(status)
   2505 
   2506     def error(self, message):

SystemExit: 2

Why is this happening? Could you please help me with that?

Mar 12 '21 08:03 mehdihosseinimoghadam

Can you try adding these lines above the inference script:

import sys
sys.argv = ['']
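A minimal sketch of why this workaround may matter, assuming the library builds its options with argparse and calls parse_args() without an explicit list (inferred from the usage message above, not confirmed from the stt source). In that case argparse reads sys.argv, and in a notebook sys.argv carries the kernel's own "-f <connection file>" arguments. The parser below is a hypothetical stand-in with a single required positional named data, like the one at the end of the usage message; it is not the library's real parser.

```python
import argparse
import sys


def build_parser():
    # Hypothetical stand-in parser: one required positional ("data"),
    # mirroring the last entry of the usage message in the report.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("data")
    parser.add_argument("--beam", type=int, default=5)
    return parser


def parse_like_library(explicit_args=None):
    # parse_args(None) falls back to sys.argv[1:], which is what
    # lets notebook kernel arguments leak into the parser.
    try:
        return build_parser().parse_args(explicit_args)
    except SystemExit as exc:
        return f"SystemExit: {exc.code}"


saved_argv = sys.argv

# Simulate a notebook kernel: "-f" is unknown to the parser.
sys.argv = ["ipykernel_launcher.py", "-f", "/tmp/kernel.json"]
leaked = parse_like_library()

# Clearing sys.argv removes "-f", but a required positional
# is then missing, so parsing can still fail.
sys.argv = [""]
cleared = parse_like_library()

# Passing an explicit list bypasses sys.argv entirely.
explicit = parse_like_library(["/path/to/data", "--beam", "50"])

sys.argv = saved_argv

print(leaked)
print(cleared)
print(explicit)
```

In this sketch both the leaked and the cleared case exit with status 2, and only the call that receives an explicit argument list parses successfully. If the library behaves the same way, clearing sys.argv removes the "-f" complaint but does not supply the required data argument.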

Mar 12 '21 11:03 mailong25

I got a similar issue. I ran exactly what you suggested:

%cd '/content/self-supervised-speech-recognition/'
import sys
sys.argv = ['']

from stt import Transcriber
transcriber = Transcriber(pretrain_model = '/content/pretrained/pretrain.pt', finetune_model = '/content/pretrained/finetune.pt',
                          dictionary = '/content/pretrained/dict.ltr.txt',
                          lm_type = 'kenlm',
                          lm_lexicon = '/content/pretrained/lexicon.txt', lm_model = '/content/pretrained/lm.bin',
                          lm_weight = 1.5, word_score = -1, beam_size = 50)
hypos = transcriber.transcribe('/content/self-supervised-speech-recognition/wavs/0_1.wav')
print(hypos)

This is what I got:

: error: the following arguments are required: data

An exception has occurred, use %tb to see the full traceback.

SystemExit: 2

/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py:2890: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)

Could you please help me with that?

Jul 09 '21 09:07 lethanhson9901

@Lethanhson9901 You can try:

!cp -r /content/self-supervised-speech-recognition/libs/fairseq/examples /usr/local/lib/python3.7/dist-packages

Sep 22 '21 18:09 martinhoang11
