
wav2vec2-object of type 'NoneType' has no len()

Open L1xxer opened this issue 1 year ago • 2 comments

🐛 Bug (I have seen all the issues about this error, but my situation is different.)

I followed the official method (https://github.com/facebookresearch/fairseq/tree/main/examples/wav2vec) to fine-tune wav2vec_small_10m.pt on another dataset. However, I get a TypeError: object of type 'NoneType' has no len().

Traceback (most recent call last):
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq_cli\hydra_train.py", line 27, in hydra_main
    _hydra_main(cfg)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq_cli\hydra_train.py", line 56, in _hydra_main
    distributed_utils.call_main(cfg, pre_main, **kwargs)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\distributed\utils.py", line 369, in call_main
    main(cfg, **kwargs)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq_cli\train.py", line 96, in main
    model = task.build_model(cfg.model)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\tasks\audio_finetuning.py", line 193, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\tasks\audio_pretraining.py", line 197, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\tasks\fairseq_task.py", line 338, in build_model
    model = models.build_model(cfg, self, from_checkpoint)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\models\__init__.py", line 106, in build_model
    return model.build_model(cfg, task)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\models\wav2vec\wav2vec2_asr.py", line 208, in build_model
    w2v_encoder = Wav2VecEncoder(cfg, len(task.target_dictionary))
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\models\wav2vec\wav2vec2_asr.py", line 407, in __init__
    model = task.build_model(w2v_args.model, from_checkpoint=True)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\tasks\audio_pretraining.py", line 197, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\tasks\fairseq_task.py", line 338, in build_model
    model = models.build_model(cfg, self, from_checkpoint)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\models\__init__.py", line 106, in build_model
    return model.build_model(cfg, task)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\models\wav2vec\wav2vec2_asr.py", line 208, in build_model
    w2v_encoder = Wav2VecEncoder(cfg, len(task.target_dictionary))
TypeError: object of type 'NoneType' has no len()
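
For context, the failing call is len(task.target_dictionary) at wav2vec2_asr.py line 208, and in the traceback above it fails the second time that line is reached, i.e. while Wav2VecEncoder rebuilds the model stored at w2v_path using the task config saved inside that checkpoint. As far as I can tell from the fairseq source, the audio fine-tuning task only builds that dictionary when labels is set; a rough paraphrase (mine, not verbatim fairseq code):

import os
from fairseq.data import Dictionary

def load_target_dictionary(cfg):
    # Paraphrase of how the audio fine-tuning task builds
    # task.target_dictionary: it only loads a dictionary when
    # cfg.labels is set, reading dict.<labels>.txt from cfg.data.
    if cfg.labels:
        dict_path = os.path.join(cfg.data, f"dict.{cfg.labels}.txt")
        return Dictionary.load(dict_path)
    # Otherwise target_dictionary stays None, and the later
    # len(task.target_dictionary) call raises exactly this TypeError.
    return None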

To Reproduce

  1. Create dict.ltr.txt for the dataset I used:
| 345995
E 149656
T 126396
O 126124
A 105266
I 98021
N 87693
H 78905
S 75716
R 65274
L 56135
U 53373
Y 49207
D 47160
W 37983
M 37210
G 34109
C 28421
F 21527
B 21383
K 20813
P 20423
' 19381
V 12276
J 4387
X 1863
Z 1067
Q 597
  2. Modify the file base_100h.yaml. The modified part is as follows.
  ...
task:
  _name: audio_finetuning
  data: C:\Users\18310\Desktop\py\feature-extraction2\trans  # only dict.ltr.txt is there (see the layout sketch after these steps)
  normalize: false
  labels: ltr
model:
  _name: wav2vec_ctc
  w2v_path: C:\Users\18310\Desktop\py\feature-extraction2\model\wav2vec_small_10m.pt
  apply_mask: true
  ...
  3. Run the command: fairseq-hydra-train distributed_training.distributed_world_size=1 --config-dir C:\Users\18310\Desktop\py\feature-extraction2\config\finetuning --config-name base_100h
  4. See the error above.
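
For comparison (assuming the layout described in the wav2vec examples README; the file names below are illustrative), the directory pointed to by task.data normally holds the manifests and letter transcripts next to the dictionary, with split names matching dataset.train_subset / dataset.valid_subset:

C:\Users\18310\Desktop\py\feature-extraction2\trans
    dict.ltr.txt   (vocabulary, the only file currently present)
    train.tsv      (audio manifest for the training split)
    train.ltr      (letter transcripts, one line per train.tsv entry)
    valid.tsv      (audio manifest for the validation split)
    valid.ltr      (letter transcripts, one line per valid.tsv entry)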

Expected behavior

The wav2vec model should fine-tune without errors.

Environment

  • fairseq Version: 0.12.2
  • PyTorch Version: 1.8.2
  • OS: Windows
  • Python version: 3.8
  • CUDA/cuDNN version: 11.1

Additional context

I have another question. As written in the README.md, fine-tuning a model requires parallel audio and label files, as well as a vocabulary file in fairseq format, so why does the given command line only reference the vocabulary file?
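
My understanding (from the wav2vec examples README, so please correct me if this is wrong) is that the command line never lists the audio or label files explicitly: task.data is a directory, and the .tsv manifests plus .ltr labels are looked up there by split name. They are typically generated along these lines (paths and the valid fraction are illustrative):

# build {train,valid}.tsv manifests from a directory of audio files
python examples/wav2vec/wav2vec_manifest.py /path/to/audio --dest /path/to/manifest --ext wav --valid-percent 0.05

# build train.ltr / train.wrd labels aligned with the manifest
# (libri_labels.py assumes LibriSpeech-style transcripts; other datasets
#  need an equivalent script emitting one transcript line per .tsv line)
python libri_labels.py /path/to/manifest/train.tsv --output-dir /path/to/manifest --output-name train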

L1xxer avatar Oct 20 '24 12:10 L1xxer

When I was running wav2vec-u, I hit this problem whenever I used the advanced wav2vec2 model; switching to the ordinary model made it go away.

XR1988 avatar Dec 09 '24 07:12 XR1988

May I ask what format the fine-tuning dataset should be in?

lyh187795 avatar Jul 31 '25 07:07 lyh187795
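
In case it helps anyone who finds this later, the fine-tuning format used by the wav2vec examples looks roughly like this (paths and sample counts below are made up for illustration):

train.tsv  (first line: audio root directory; then one "relative_path<TAB>num_samples" line per utterance)
    C:\data\audio
    spk1\utt001.wav	52400
    spk1\utt002.wav	68800

train.ltr  (one line per .tsv entry: characters separated by spaces, with "|" marking word boundaries)
    H E L L O | W O R L D |
    G O O D | M O R N I N G |

train.wrd  (word-level transcripts produced by the same labelling script, in the same order)
    HELLO WORLD
    GOOD MORNING

dict.ltr.txt then lists each output character and its frequency, as in the dictionary pasted above.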