fairseq
[Wav2Vec2] Cannot load newly added Wav2Vec2 checkpoints
🐛 Bug
A recent commit (https://github.com/pytorch/fairseq/commit/2513524a1604dbafcc4ea9cc5a99ae0aa4f19694) added two new fine-tuned Wav2Vec2 checkpoints. However, there seems to be a problem with the saved config, as those checkpoints cannot be loaded. For example, the following code fails:
import fairseq
model, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task([checkpoint_path], arg_overrides={"data": "path/to/dict"})
To Reproduce
The following colab reproduces the error (one just has to run all cells): https://colab.research.google.com/drive/13hJI4w8pOD33hxOJ_qwKkN9QqdKVH5IM?usp=sharing
Kindly pinging @alexeib here :-)
@patrickvonplaten I think this is fixed on the master branch. I changed the install line to
!pip install git+https://github.com/pytorch/fairseq.git@master
Still got the same problem ;-)
See https://colab.research.google.com/drive/13hJI4w8pOD33hxOJ_qwKkN9QqdKVH5IM?usp=sharing
@patrickvonplaten a notebook with a working example also, I removed
@patrickvonplaten Hi, I ran into the same problem. Do you have a solution? Thank you. I ran this code:
model, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task([cp_path])
I got the error:
ConfigKeyError: Key 'target_dict' not in 'AudioPretrainingConfig' full_key: target_dict reference_type=Optional[AudioPretrainingConfig] object_type=AudioPretrainingConfig
Bump. Got a similar error.
Neither target_dict nor eval_wer makes sense in AudioPretrainingConfig, as pretraining does not use text. Has anyone found a solution?
omegaconf.errors.ConfigKeyError: Key 'eval_wer' not in 'AudioPretrainingConfig'
full_key: eval_wer
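The error above is a structured config rejecting keys that its dataclass does not declare. A minimal sketch for listing such keys, assuming the saved task config is available as a plain dict. `PretrainingConfigStandIn` and the example values are hypothetical stand-ins for illustration, not fairseq's real `AudioPretrainingConfig` or the contents of any actual checkpoint:

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List, Optional


@dataclass
class PretrainingConfigStandIn:
    # Hypothetical stand-in for a pretraining task config; the real
    # AudioPretrainingConfig in fairseq declares more fields than this.
    data: Optional[str] = None
    sample_rate: int = 16000
    normalize: bool = False


def undeclared_keys(saved_cfg: Dict[str, Any], config_cls: type) -> List[str]:
    """Return the keys of saved_cfg that config_cls does not declare as fields."""
    declared = {f.name for f in fields(config_cls)}
    # "_name" is treated here as a bookkeeping key and skipped (an assumption).
    return sorted(k for k in saved_cfg if k not in declared and k != "_name")


# Made-up example of a task config written by a fine-tuning run.
saved_task_cfg = {
    "_name": "audio_pretraining",
    "data": "path/to/dict",
    "normalize": True,
    "eval_wer": False,
    "target_dict": None,
}

print(undeclared_keys(saved_task_cfg, PretrainingConfigStandIn))
# → ['eval_wer', 'target_dict']
```

Any key this reports is one the task's config class would refuse, which matches the two keys named in the errors in this thread.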
Otherwise, the robust models are also available on the HF Hub, if you're interested: https://huggingface.co/models?arxiv=arxiv:2104.01027
I am still getting this error
ConfigKeyError: Key 'eval_wer' not in 'AudioPretrainingConfig'
full_key: eval_wer
while running the code
import torch
import fairseq
cp_path = '../w2v_large_lv_fsh_swbd_cv_ftsb300_updated.pt'
model, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([cp_path])
model = model[0]
model.eval()
Package Version Location
---------------------- --------------- --------------------------------------------
antlr4-python3-runtime 4.8
backcall 0.2.0
bitarray 2.3.7
certifi 2021.10.8
cffi 1.15.0
colorama 0.4.4
Cython 0.29.28
debugpy 1.5.1
decorator 5.1.1
entrypoints 0.3
fairseq 1.0.0a0+5175fd5
hydra-core 1.0.7
ipykernel 6.4.1
ipython 7.31.1
ipython-genutils 0.2.0
jedi 0.18.1
jupyter-client 7.1.2
jupyter-core 4.9.1
matplotlib-inline 0.1.2
nest-asyncio 1.5.1
numpy 1.22.2
omegaconf 2.0.6
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
pip 21.2.4
portalocker 2.4.0
prompt-toolkit 3.0.20
protobuf 3.19.4
ptyprocess 0.7.0
pycparser 2.21
Pygments 2.11.2
python-dateutil 2.8.2
PyYAML 6.0
pyzmq 22.3.0
regex 2022.1.18
sacrebleu 2.0.0
setuptools 58.0.4
six 1.16.0
tabulate 0.8.9
tensorboardX 2.5
torch 1.10.2
torchaudio 0.10.2
tornado 6.1
tqdm 4.62.3
traitlets 5.1.1
typing_extensions 4.1.1
wcwidth 0.2.5
wheel 0.37.1
Same problem. Have you solved it?
You can solve this by cloning the repo, and then just copying all those missing parameters from the audio fine-tuning config into the audio pretraining config
I did not get this... Shouldn't the config be included in the checkpoints?
@patrickvonplaten Hi, I think I got it. I copied the parameters from \fairseq\fairseq\tasks\audio_finetuning.py (AudioFinetuningConfig) to audio_pretraining.py (AudioPretrainingConfig). Is that what you meant?
One more question: the frame size seems to be 20 ms. Is that the case? I could not find where this is configured.
Thank you a lot!
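On the frame size question, a quick arithmetic check is possible. This sketch assumes the default wav2vec 2.0 feature encoder layout (seven convolution layers given as (dim, kernel, stride) tuples) and 16 kHz input audio. Both are assumptions here, and a particular checkpoint may be configured differently:

```python
# Assumed default wav2vec 2.0 feature encoder: (dim, kernel, stride) per layer.
conv_feature_layers = [(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512, 2, 2)] * 2
sample_rate = 16000  # Hz, assumed input sample rate


def frame_geometry(layers):
    """Return (hop, receptive_field) in input samples for stacked 1-D convs."""
    hop, receptive_field = 1, 1
    for _dim, kernel, stride in layers:
        # Each layer widens the window by (kernel - 1) steps of the current hop.
        receptive_field += (kernel - 1) * hop
        # Strides multiply, so the hop between output frames grows per layer.
        hop *= stride
    return hop, receptive_field


hop, receptive_field = frame_geometry(conv_feature_layers)
hop_ms = 1000 * hop / sample_rate
window_ms = 1000 * receptive_field / sample_rate
print(hop, receptive_field, hop_ms, window_ms)
# → 320 400 20.0 25.0
```

Under these assumptions the encoder emits one frame every 320 samples (20 ms), and each frame covers 400 samples (25 ms) of audio.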
@patrickvonplaten, I have updated AudioPretrainingConfig with the class attributes of AudioFinetuningTask that were missing from AudioPretrainingConfig and were causing the error. But now another attribute, target_dict, is missing, and I get this error:
ConfigKeyError: Key 'target_dict' not in 'AudioPretrainingConfig'
full_key: target_dict
I searched for it and found that it is used in examples/wav2vec/unsupervised/models/wav2vec_u.py, where target_dict is passed to __init__. But where can I pass it, or create it as a class attribute, in AudioPretrainingConfig?
+1 for the eval_wer / AudioPretrainingConfig issue when using wav2vec2.
@keunwoochoi eval_wer is not available in the pretraining task. Override the task to finetuning, like this:
model_override_rules = ast.literal_eval(args.model_overrides)
model_override_rules['task'] = {'_name': 'audio_finetuning'}
models, saved_cfg, task = checkpoint_utils.load_model_ensemble_and_task(
You can load the checkpoint now, but it appears the checkpoint has the wrong task (it was probably created before audio_finetuning was split off from audio_pretraining). The remedy is to update the task, just as @effendijohanes pointed out. I will try to update the checkpoints at some point in the future.
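The task override discussed above can be sketched as a small self-contained helper. The function names `finetuning_overrides` and `load_with_finetuning_task` are hypothetical, introduced only for this example. The call to `load_model_ensemble_and_task` with `arg_overrides` follows the usage shown earlier in this thread, and whether it loads a given checkpoint may depend on the installed fairseq version:

```python
import ast
from typing import Any, Dict


def finetuning_overrides(model_overrides: str = "{}") -> Dict[str, Any]:
    """Parse a dict literal of overrides and switch the task to audio_finetuning."""
    overrides = ast.literal_eval(model_overrides)
    overrides["task"] = {"_name": "audio_finetuning"}
    return overrides


def load_with_finetuning_task(cp_path: str, model_overrides: str = "{}"):
    """Load a checkpoint while overriding its saved task (hypothetical helper)."""
    # Imported lazily so finetuning_overrides works without fairseq installed.
    from fairseq import checkpoint_utils

    return checkpoint_utils.load_model_ensemble_and_task(
        [cp_path], arg_overrides=finetuning_overrides(model_overrides)
    )


print(finetuning_overrides('{"data": "path/to/dict"}'))
# → {'data': 'path/to/dict', 'task': {'_name': 'audio_finetuning'}}
```

Usage would then look like `models, saved_cfg, task = load_with_finetuning_task(cp_path)`, with `cp_path` pointing at the downloaded checkpoint.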