How to reproduce the speech data2vec result?

Open yangjiabupt opened this issue 1 year ago • 1 comments

I followed the instructions at https://github.com/facebookresearch/fairseq/tree/main/examples/data2vec to reproduce the speech data2vec result, and I now have the pretrained model.

Then I tried to fine-tune with "fairseq-hydra-train \
distributed_training.distributed_port=$PORT \
task.data=/path/to/data \
model.w2v_path=/path/to/model.pt \
--config-dir /path/to/fairseq-py/examples/wav2vec/config/finetuning \
--config-name base_100h common.user_dir=examples/data2vec".

This first fails with: fairseq-hydra-train: error: unrecognized arguments: --common.user_dir=examples/data2vec

Then I ran "fairseq-hydra-train \
task.data=/path/to/data \
model.w2v_path=/path/to/model.pt \
--config-dir /path/to/fairseq-py/examples/wav2vec/config/finetuning \
--config-name base_100h" to fine-tune.
However, this error occurs:

"File "/home/research/jiayang/data2vector/fairseq/fairseq/tasks/fairseq_task.py", line 338, in build_model
model = models.build_model(cfg, self, from_checkpoint)

File "/home/research/jiayang/data2vector/fairseq/fairseq/models/__init__.py", line 102, in build_model
"Available models: {}".format(MODEL_DATACLASS_REGISTRY.keys())
KeyError: "'_name'""

This is really confusing. The pretrained model's config is data2vec, yet the fine-tuning config is the wav2vec one?

Is there a mistake somewhere? Any help would be appreciated.

yangjiabupt, Jun 30 '22 07:06

I fixed the same issue by adding common.user_dir=examples/data2vec after task.data=/path/to/data and before the --config-dir option.

Try this:

fairseq-hydra-train \
distributed_training.distributed_port=$PORT \
task.data=/path/to/data \
common.user_dir=examples/data2vec \
model.w2v_path=/path/to/model.pt \
--config-dir /path/to/fairseq-py/examples/wav2vec/config/finetuning \
--config-name base_100h

Abdullah955, Jul 06 '22 18:07
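
One possible reason the position of common.user_dir matters is sketched below. It assumes, without having checked the fairseq or Hydra source for this version, that the command line is parsed by a standard Python argparse parser in which all key=value overrides form a single nargs="*" positional next to the --config-dir and --config-name options. The parser here (toy-hydra-train, build_parser, split_args) is a hypothetical stand-in, not the real fairseq-hydra-train code.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the real CLI: one "*" positional that
    # collects key=value overrides, plus the two config options.
    parser = argparse.ArgumentParser(prog="toy-hydra-train")
    parser.add_argument("overrides", nargs="*")
    parser.add_argument("--config-dir")
    parser.add_argument("--config-name")
    return parser


def split_args(argv):
    # parse_known_args returns the leftovers instead of exiting, so we can
    # inspect what parse_args would report as "unrecognized arguments".
    namespace, rejected = build_parser().parse_known_args(argv)
    return namespace.overrides, rejected


flags = ["--config-dir", "cfg", "--config-name", "base_100h"]

# Override placed before the options: it is collected with the others.
print(split_args(["task.data=/d", "common.user_dir=examples/data2vec"] + flags))

# Override placed after the options: the "*" positional was already filled
# by the first group, so the trailing override is rejected.
print(split_args(["task.data=/d"] + flags + ["common.user_dir=examples/data2vec"]))

# Override written with leading dashes: treated as an unknown option.
print(split_args(["task.data=/d"] + flags + ["--common.user_dir=examples/data2vec"]))
```

Under that assumption, only the first ordering hands common.user_dir to the config system. This matches the fix above of keeping every key=value override together before --config-dir.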