transformers

the script convert_wav2vec2_original_pytorch_checkpoint_to_pytorch.py does not work on fairseq wav2vec2-xls-r fine-tuned model

Open monirome opened this issue 2 years ago • 5 comments

System Info


  • transformers version: 4.39.0.dev0
  • Platform: Linux-6.5.0-18-generic-x86_64-with-glibc2.35
  • Python version: 3.9.12
  • Huggingface_hub version: 0.19.3
  • Safetensors version: 0.4.2
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.0+cu121 (False)
  • Tensorflow version (GPU?): 2.11.0 (False)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help?

@sanchit-gandhi

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

On bash:

python3 /home/monica/PycharmProjects/convert_HF/transformers/src/transformers/models/wav2vec2/convert_wav2vec2_original_pytorch_checkpoint_to_pytorch.py \
    --pytorch_dump_folder_path guarani/output \
    --checkpoint_path /home/monica/PycharmProjects/convert_HF/guarani/guarani_model.pt \
    --config_path /home/monica/PycharmProjects/convert_HF/guarani/config.json \
    --dict_path /home/monica/PycharmProjects/convert_HF/guarani/dict.ltr.txt

Then this is the error:

Traceback (most recent call last):
  File "/home/monica/PycharmProjects/convert_HF/transformers/src/transformers/models/wav2vec2/convert_wav2vec2_original_pytorch_checkpoint_to_pytorch.py", line 364, in <module>
    convert_wav2vec2_checkpoint(
  File "/home/monica/anaconda3/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/monica/PycharmProjects/convert_HF/transformers/src/transformers/models/wav2vec2/convert_wav2vec2_original_pytorch_checkpoint_to_pytorch.py", line 331, in convert_wav2vec2_checkpoint
    model, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task(
  File "/home/monica/anaconda3/lib/python3.9/site-packages/fairseq/checkpoint_utils.py", line 436, in load_model_ensemble_and_task
    task = tasks.setup_task(cfg.task)
  File "/home/monica/anaconda3/lib/python3.9/site-packages/fairseq/tasks/__init__.py", line 39, in setup_task
    cfg = merge_with_parent(dc(), cfg)
  File "/home/monica/anaconda3/lib/python3.9/site-packages/fairseq/dataclass/utils.py", line 500, in merge_with_parent
    merged_cfg = OmegaConf.merge(dc, cfg)
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/omegaconf.py", line 321, in merge
    target.merge_with(*others[1:])
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/basecontainer.py", line 331, in merge_with
    self._format_and_raise(key=None, value=None, cause=e)
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/base.py", line 95, in _format_and_raise
    format_and_raise(
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/_utils.py", line 629, in format_and_raise
    _raise(ex, cause)
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex  # set env OC_CAUSE=1 for full backtrace
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/basecontainer.py", line 329, in merge_with
    self._merge_with(*others)
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/basecontainer.py", line 347, in _merge_with
    BaseContainer._map_merge(self, other)
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/basecontainer.py", line 314, in _map_merge
    dest[key] = src._get_node(key)
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 258, in __setitem__
    self._format_and_raise(
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/base.py", line 95, in _format_and_raise
    format_and_raise(
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/_utils.py", line 629, in format_and_raise
    _raise(ex, cause)
  File "/home/monica/anaconda3/lib/python3.9/site-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex  # set env OC_CAUSE=1 for full backtrace
omegaconf.errors.ConfigKeyError: Key 'multiple_train_files' not in 'AudioFinetuningConfig'
    full_key: multiple_train_files
    reference_type=Optional[AudioFinetuningConfig]
    object_type=AudioFinetuningConfig
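The error comes from fairseq's merge_with_parent, which rejects any key in the saved task config that AudioFinetuningConfig no longer declares. A common workaround (a sketch, not an official fix — the key list, helper name, and paths below are assumptions based on the keys reported in this thread) is to strip those stale keys from the checkpoint before running the conversion script:

```python
# Hypothetical helper: drop stale task-config keys from a fairseq
# checkpoint so load_model_ensemble_and_task can merge the config.
# Extend STALE_TASK_KEYS with whatever keys your checkpoint carries
# that the installed fairseq's AudioFinetuningConfig rejects.
STALE_TASK_KEYS = ["multiple_train_files", "eval_wer", "eval_wer_config"]

def strip_stale_task_keys(ckpt, stale_keys=STALE_TASK_KEYS):
    """Remove task-config keys that merge_with_parent would reject.

    `ckpt` is the object returned by torch.load(checkpoint_path); in
    recent fairseq versions the config lives under ckpt["cfg"]["task"].
    Depending on the fairseq version this is either a plain dict or an
    OmegaConf DictConfig, so handle both.
    """
    task_cfg = ckpt["cfg"]["task"]
    for key in stale_keys:
        if isinstance(task_cfg, dict):
            task_cfg.pop(key, None)
        else:
            # OmegaConf DictConfig: struct mode must be relaxed before
            # keys can be removed.
            from omegaconf import open_dict
            with open_dict(task_cfg):
                task_cfg.pop(key, None)
    return ckpt

# Real usage (requires torch and fairseq installed):
#   import torch
#   ckpt = torch.load("guarani/guarani_model.pt", map_location="cpu")
#   strip_stale_task_keys(ckpt)
#   torch.save(ckpt, "guarani/guarani_model_cleaned.pt")
# then point --checkpoint_path at the cleaned file.
```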

Expected behavior

convert fairseq fine-tuned model checkpoint (.pt) to the Huggingface format without errors

monirome avatar Feb 21 '24 21:02 monirome

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Mar 23 '24 08:03 github-actions[bot]

Hi! @sanchit-gandhi could you help me with this issue?

Thank you very much!

monirome avatar Mar 23 '24 15:03 monirome

@patrickvonplaten could you help me with this issue?

Thank you very much!

monirome avatar Apr 08 '24 14:04 monirome

Gentle ping @sanchit-gandhi @ylacombe

amyeroberts avatar Apr 08 '24 14:04 amyeroberts

Another ping @sanchit-gandhi @ylacombe

amyeroberts avatar May 07 '24 10:05 amyeroberts

Hey @monirome, thanks for opening this issue! It seems like it might be related to your checkpoint. Do you think you could send a code snippet and a checkpoint that would allow us to reproduce the issue?
Thanks!

ylacombe avatar May 13 '24 15:05 ylacombe

Hi everyone, I tried the script and got ValueError: Shape of hf is torch.Size([768]), but should be torch.Size([1024]) for w2v_encoder.w2v_model.mask_emb. I had to remove keys like multiple_train_files, eval_wer, eval_wer_config, etc. (one of which is mentioned by @monirome) before fine-tuning, as they were not present in AudioFinetuningConfig.
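A [768] vs [1024] mismatch on mask_emb typically means the config.json passed to the conversion script describes a base-sized model (hidden_size=768) while the fairseq checkpoint is a large/XLS-R one (encoder_embed_dim=1024). A minimal sanity check (the helper name and paths are hypothetical; mask_emb has shape (hidden_size,), so this is exactly the dimension the error compares):

```python
import json

def check_hidden_size(hf_config_path, fairseq_encoder_embed_dim):
    """Compare the HF config's hidden_size against the fairseq model's
    encoder_embed_dim before running the conversion script."""
    with open(hf_config_path) as f:
        hidden = json.load(f)["hidden_size"]
    if hidden != fairseq_encoder_embed_dim:
        raise ValueError(
            f"config.json hidden_size={hidden} but the checkpoint expects "
            f"{fairseq_encoder_embed_dim}; regenerate config.json from the "
            f"matching architecture (wav2vec2-large / XLS-R uses 1024)."
        )
```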

bhavitvyamalik avatar May 24 '24 15:05 bhavitvyamalik

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jun 18 '24 08:06 github-actions[bot]

Gentle nudge to @ylacombe and @sanchit-gandhi

bhavitvyamalik avatar Jun 19 '24 14:06 bhavitvyamalik

Hey @bhavitvyamalik, could you share a reproducible snippet and your model (or a dummy model)? It's difficult to reproduce if I don't have your checkpoint, dict and config! Thanks!

ylacombe avatar Jun 19 '24 14:06 ylacombe

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jul 14 '24 08:07 github-actions[bot]