icefall
Issues when I try to run the yesno recipe for testing purposes
Hi! After installing all the dependencies for icefall, I tried to run the yesno recipe to check whether the installation was successful. During the data preparation stage, I get this error:
stek@99a34efcadb3:/cappellazzo/icefall_repo/icefall/egs/yesno/ASR$ ./prepare.sh
2023-06-02 09:47:46 (prepare.sh:27:main) dl_dir: /cappellazzo/icefall_repo/icefall/egs/yesno/ASR/download
2023-06-02 09:47:46 (prepare.sh:30:main) Stage 0: Download data
/cappellazzo/icefall_repo/icefall/egs/yesno/ASR/download/waves_yesno.tar.gz: 100%|███| 4.70M/4.70M [00:00<00:00, 30.5MB/s]
2023-06-02 09:47:48 (prepare.sh:39:main) Stage 1: Prepare yesno manifest
2023-06-02 09:47:50 (prepare.sh:45:main) Stage 2: Compute fbank for yesno
2023-06-02 09:47:52,082 INFO [compute_fbank_yesno.py:65] Processing train
Extracting and storing features: 100%|███████████████████████████████████████████████████| 90/90 [00:00<00:00, 256.03it/s]
2023-06-02 09:47:52,457 INFO [compute_fbank_yesno.py:65] Processing test
Extracting and storing features: 100%|███████████████████████████████████████████████████| 30/30 [00:00<00:00, 354.99it/s]
2023-06-02 09:47:52 (prepare.sh:51:main) Stage 3: Prepare lang
2023-06-02 09:47:55 (prepare.sh:63:main) Stage 4: Prepare G
/project/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):79
[I] Reading \data\ section.
/project/kaldilm/csrc/arpa_file_parser.cc:void kaldilm::ArpaFileParser::Read(std::istream&):140
[I] Reading \1-grams: section.
2023-06-02 09:47:55 (prepare.sh:89:main) Stage 5: Compile HLG
2023-06-02 09:47:56,632 INFO [compile_hlg.py:124] Processing data/lang_phone
2023-06-02 09:47:56,633 INFO [lexicon.py:171] Converting L.pt to Linv.pt
2023-06-02 09:47:56,638 INFO [compile_hlg.py:48] Building ctc_topo. max_token_id: 3
2023-06-02 09:47:56,638 INFO [compile_hlg.py:52] Loading G.fst.txt
2023-06-02 09:47:56,639 INFO [compile_hlg.py:62] Intersecting L and G
2023-06-02 09:47:56,639 INFO [compile_hlg.py:64] LG shape: (4, None)
2023-06-02 09:47:56,639 INFO [compile_hlg.py:66] Connecting LG
2023-06-02 09:47:56,639 INFO [compile_hlg.py:68] LG shape after k2.connect: (4, None)
2023-06-02 09:47:56,640 INFO [compile_hlg.py:70] <class 'torch.Tensor'>
2023-06-02 09:47:56,640 INFO [compile_hlg.py:71] Determinizing LG
2023-06-02 09:47:56,640 INFO [compile_hlg.py:74] <class '_k2.ragged.RaggedTensor'>
2023-06-02 09:47:56,640 INFO [compile_hlg.py:76] Connecting LG after k2.determinize
2023-06-02 09:47:56,640 INFO [compile_hlg.py:79] Removing disambiguation symbols on LG
2023-06-02 09:47:56,641 INFO [compile_hlg.py:91] LG shape after k2.remove_epsilon: (6, None)
Traceback (most recent call last):
File "/cappellazzo/icefall_repo/icefall/egs/yesno/ASR/./local/compile_hlg.py", line 136, in <module>
main()
File "/cappellazzo/icefall_repo/icefall/egs/yesno/ASR/./local/compile_hlg.py", line 126, in main
HLG = compile_HLG(lang_dir)
File "/cappellazzo/icefall_repo/icefall/egs/yesno/ASR/./local/compile_hlg.py", line 93, in compile_HLG
LG = k2.connect(LG)
File "/opt/conda/lib/python3.10/site-packages/k2/fsa_algo.py", line 522, in connect
if fsa.properties & fsa_properties.ACCESSIBLE != 0 and \
File "/opt/conda/lib/python3.10/site-packages/k2/fsa.py", line 446, in properties
raise RuntimeError(
RuntimeError: The fsa attribute (labels) has been inappropriately modified like:
fsa.labels[xxx] = yyy
The correct way should be like:
labels = fsa.labels
labels[xxx] = yyy
fsa.labels = labels
Any idea how to fix it? I don't know whether it is caused by my installation or whether it is an issue with the recipe itself. Thank you!
Please install the latest k2.
I used the following command to install k2, but it does not install the latest version.
$ conda install -c k2-fsa -c pytorch -c nvidia k2 pytorch=1.13.0 pytorch-cuda=11.7 python=3.8
It installs version 1.23.4, not 1.24.1.
We have prebuilt wheels. You can find them in the installation doc. Please use pip install or compile k2 from source.
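For example (illustrative only; take the exact version string for your PyTorch/CUDA combination from the installation doc, and verify the find-links URL there as well):
pip install k2==1.24.3.dev20230606+cuda11.7.torch2.0.1 -f https://k2-fsa.github.io/k2/cuda.html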
I opted for the conda installation because the official documentation recommends installing k2 with conda.
Sorry, the conda packages have not been updated.
In the end I managed to install it via pip. However, I would recommend updating the installation instructions, or at least marking the pip installation as "recommended" instead of conda; it would save time for future users.
After carrying out data preparation for LibriSpeech (it completed successfully), I'm trying to run the training stage for conformer_ctc (./conformer_ctc/train.py --full-libri False --num-epochs 30), but I get this error:
2023-06-06 13:24:41,697 INFO [train.py:770] Sanity check -- see if any of the batches in epoch 0 would cause OOM.
[W] /stek/matasso/k2/build/temp.linux-x86_64-cpython-310/k2/csrc/pytorch_context.cc:81:void k2::InitHasCuda() k2 was not compiled with CUDA. Return a CPU context.
Segmentation fault (core dumped)
Any solution to this error? Can it be related to the installation phase?
The log says you have installed a CPU version of k2. Please switch to a CUDA version.
Please show the output of
python3 -m k2.version
and please describe how you installed k2.
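As a quick sanity check of the environment, these two commands also help; the expectation that a CUDA build of k2 carries "+cudaXX.torchYY" in its version string is an assumption based on the wheel naming that appears later in this thread:
python3 -c "import torch; print(torch.cuda.is_available())"
python3 -m pip show k2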
My colleague and I started from a fresh PyTorch docker image, installed all the dependencies again, and now it works. Probably a few days ago we had some conflicts due to the multiple attempts to install k2, and the CUDA libraries were not visible to k2, or something similar. By the way, I'm now running a LibriSpeech recipe and it's working properly. I'm going to close the issue in a few days if everything goes smoothly. Thank you!
Hi, I ran the conformer_ctc recipe with the following command:
./conformer_ctc/train.py --full-libri False --num-epochs 30
and both training and decoding went fine. With this configuration the architecture includes a transformer decoder, but now I'd like to re-run the same recipe with CTC only, dispensing with the decoder. I tried this:
./conformer_ctc/train.py --num-decoder-layers 0 --full-libri False --num-epochs 30
because the --help text for --num-decoder-layers says "Setting this to 0 will not create the decoder at all (pure CTC model)". However, I get this error:
root@4677c86aefb3:/stek/cappellazzo/JSALT2023/icefall/egs/librispeech/ASR# ./conformer_ctc/train.py --num-decoder-layers 0 --full-libri False --num-epochs 30
fatal: detected dubious ownership in repository at '/stek/cappellazzo/JSALT2023/icefall'
To add an exception for this directory, call:
git config --global --add safe.directory /stek/cappellazzo/JSALT2023/icefall
fatal: detected dubious ownership in repository at '/stek/cappellazzo/JSALT2023/icefall'
To add an exception for this directory, call:
git config --global --add safe.directory /stek/cappellazzo/JSALT2023/icefall
fatal: detected dubious ownership in repository at '/stek/cappellazzo/JSALT2023/icefall'
To add an exception for this directory, call:
git config --global --add safe.directory /stek/cappellazzo/JSALT2023/icefall
2023-06-08 12:01:38,958 INFO [train.py:610] Training started
2023-06-08 12:01:38,959 INFO [train.py:611] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'use_feat_batchnorm': True, 'attention_dim': 512, 'nhead': 8, 'beam_size': 10, 'reduction': 'sum', 'use_double_scores': True, 'weight_decay': 1e-06, 'warm_step': 80000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '', 'k2-git-date': '', 'lhotse-version': '1.16.0.dev+git.cf4446d.clean', 'torch-version': '2.0.1', 'torch-cuda-available': True, 'torch-cuda-version': '11.7', 'python-version': '3.1', 'icefall-git-branch': None, 'icefall-git-sha1': None, 'icefall-git-date': None, 'icefall-path': '/stek/cappellazzo/JSALT2023/icefall', 'k2-path': '/opt/conda/lib/python3.10/site-packages/k2-1.24.3.dev20230606+cuda11.7.torch2.0.1-py3.10-linux-x86_64.egg/k2/__init__.py', 'lhotse-path': '/opt/conda/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': '4677c86aefb3', 'IP address': '172.17.0.16'}, 'world_size': 1, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 30, 'start_epoch': 0, 'exp_dir': PosixPath('conformer_ctc/exp'), 'lang_dir': PosixPath('data/lang_bpe_500'), 'att_rate': 0.8, 'num_decoder_layers': 0, 'lr_factor': 5.0, 'seed': 42, 'full_libri': False, 'mini_libri': False, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures'}
2023-06-08 12:01:39,281 INFO [lexicon.py:168] Loading pre-compiled data/lang_bpe_500/Linv.pt
2023-06-08 12:01:39,675 INFO [train.py:659] About to create model
2023-06-08 12:01:41,703 INFO [asr_datamodule.py:413] About to get train-clean-100 cuts
2023-06-08 12:01:41,704 INFO [asr_datamodule.py:232] Enable MUSAN
2023-06-08 12:01:41,704 INFO [asr_datamodule.py:233] About to get Musan cuts
2023-06-08 12:01:43,705 INFO [asr_datamodule.py:257] Enable SpecAugment
2023-06-08 12:01:43,705 INFO [asr_datamodule.py:258] Time warp factor: 80
2023-06-08 12:01:43,705 INFO [asr_datamodule.py:268] Num frame mask: 10
2023-06-08 12:01:43,706 INFO [asr_datamodule.py:281] About to create train dataset
2023-06-08 12:01:43,706 INFO [asr_datamodule.py:308] Using DynamicBucketingSampler.
2023-06-08 12:01:46,161 INFO [asr_datamodule.py:323] About to create train dataloader
2023-06-08 12:01:46,162 INFO [asr_datamodule.py:451] About to get dev-clean cuts
2023-06-08 12:01:46,163 INFO [asr_datamodule.py:458] About to get dev-other cuts
2023-06-08 12:01:46,163 INFO [asr_datamodule.py:354] About to create dev dataset
2023-06-08 12:01:46,355 INFO [asr_datamodule.py:371] About to create dev dataloader
2023-06-08 12:01:46,355 INFO [train.py:770] Sanity check -- see if any of the batches in epoch 0 would cause OOM.
Traceback (most recent call last):
File "/stek/cappellazzo/JSALT2023/icefall/egs/librispeech/ASR/./conformer_ctc/train.py", line 819, in <module>
main()
File "/stek/cappellazzo/JSALT2023/icefall/egs/librispeech/ASR/./conformer_ctc/train.py", line 812, in main
run(rank=0, world_size=1, args=args)
File "/stek/cappellazzo/JSALT2023/icefall/egs/librispeech/ASR/./conformer_ctc/train.py", line 714, in run
scan_pessimistic_batches_for_oom(
File "/stek/cappellazzo/JSALT2023/icefall/egs/librispeech/ASR/./conformer_ctc/train.py", line 778, in scan_pessimistic_batches_for_oom
loss, _ = compute_loss(
File "/stek/cappellazzo/JSALT2023/icefall/egs/librispeech/ASR/./conformer_ctc/train.py", line 424, in compute_loss
att_loss = mmodel.decoder_forward(
File "/stek/cappellazzo/JSALT2023/icefall/egs/librispeech/ASR/conformer_ctc/transformer.py", line 287, in decoder_forward
tgt = self.decoder_embed(ys_in_pad) # (N, T) -> (N, T, C)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'Conformer' object has no attribute 'decoder_embed'. Did you mean: 'encoder_embed'?
Any clue? I don't know whether setting --num-decoder-layers 0 is enough to run pure CTC training.
In order to run "pure CTC" training, both --num-decoder-layers and --att-rate MUST be set to 0 and 0.0, respectively. Setting only --num-decoder-layers 0 results in the error I attached above.
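For reference, the pure-CTC run then becomes (the remaining flags are simply carried over from the earlier command):
./conformer_ctc/train.py --num-decoder-layers 0 --att-rate 0.0 --full-libri False --num-epochs 30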