Self-supervised pretraining (wav2vec 2.0/data2vec) for wenet

Open Emiyassstar opened this issue 2 years ago • 6 comments

1. Support self-supervised pretraining using the wav2vec 2.0/data2vec methods
2. Add SSL recipe in librispeech/ssl
3. Add SSL recipe in aishell/ssl

[Screenshot, 2021-11-16 11:36:43 AM]

Emiyassstar avatar Mar 31 '22 09:03 Emiyassstar

cool!

misaka23 avatar Mar 31 '22 11:03 misaka23

nice

Yymax-max avatar Apr 02 '22 03:04 Yymax-max

Looking forward to the latest developments

liufei1656 avatar Apr 08 '22 02:04 liufei1656

I tried to reproduce this example using the pretrained model at https://huggingface.co/emiyasstar/ch-w2v-conformer and got the following error:

Traceback (most recent call last):
  File "wenet/bin/train.py", line 322, in <module>
    main()
  File "wenet/bin/train.py", line 234, in main
    infos = load_trained_modules(model, args)
  File "/home/wenet_ssl/wenet/utils/checkpoint.py", line 95, in load_trained_modules
    model.load_state_dict(main_state_dict)
  File "/home/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1482, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Wav2vec2Model:
    Unexpected key(s) in state_dict: "encoder.embed.linear.weight", "encoder.embed.linear.bias".
    size mismatch for encoder.embed.conv.2.weight: copying a param with shape torch.Size([512, 512, 5, 5]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]).

How should I change my setup to fix this? I hope to hear back from you.
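For anyone hitting the same error, it can help to list every mismatch between the checkpoint and the model before calling `load_state_dict`. Below is a minimal sketch in plain Python, assuming the parameter shapes have already been extracted into dicts (with PyTorch, something like `{k: tuple(v.shape) for k, v in state_dict.items()}`). `diff_state_shapes` is a hypothetical helper, not part of wenet, and the shape given for `encoder.embed.linear.weight` is a made-up placeholder; only the `conv.2` shapes and the key names come from the traceback.

```python
def diff_state_shapes(ckpt_shapes, model_shapes):
    """Compare two {param_name: shape} dicts.

    Returns (unexpected, missing, mismatched), mirroring the three kinds of
    problems that a strict state-dict load reports.
    """
    # Keys present in the checkpoint but unknown to the model.
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    # Keys present on both sides whose shapes differ.
    mismatched = {
        k: (ckpt_shapes[k], model_shapes[k])
        for k in ckpt_shapes
        if k in model_shapes and ckpt_shapes[k] != model_shapes[k]
    }
    return unexpected, missing, mismatched


# Shapes of conv.2 are taken from the traceback; the linear shapes are
# hypothetical placeholders used only to make the example self-contained.
ckpt = {
    "encoder.embed.conv.2.weight": (512, 512, 5, 5),
    "encoder.embed.linear.weight": (512, 6144),  # hypothetical shape
    "encoder.embed.linear.bias": (512,),         # hypothetical shape
}
model = {
    "encoder.embed.conv.2.weight": (512, 512, 3, 3),
}

unexpected, missing, mismatched = diff_state_shapes(ckpt, model)
print("unexpected:", unexpected)
print("missing:", missing)
print("mismatched:", mismatched)
```

If the mismatches are all in `encoder.embed.*`, as they are here, that suggests the input subsampling layer in the training config differs from the one the checkpoint was trained with.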

rookie0607 avatar Apr 03 '23 09:04 rookie0607

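ch-w2v-conformer uses a 6x downsampling model, and the pretraining-specific parameters were removed from the checkpoint so that it is compatible with the master branch code. You can load the model with the config file provided in the openasr recipe we released: https://github.com/wenet-e2e/wenet/tree/main/examples/openasr2021/s0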
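To make the 6x downsampling concrete, here is a rough sketch of how two stacked, unpadded convolutions reduce the number of frames. The 5x5 kernel of the second layer matches the checkpoint shape in the error message; the 3x3 first layer and the strides (2 then 3 for 6x, 2 then 2 for 4x) are assumptions for illustration and should be checked against the recipe's config and the subsampling code.

```python
def conv_out_len(length, kernel, stride):
    """Output length of an unpadded convolution along one axis."""
    return (length - kernel) // stride + 1


def subsampled_frames(num_frames, layers):
    """Apply a list of (kernel, stride) layers along the time axis."""
    for kernel, stride in layers:
        num_frames = conv_out_len(num_frames, kernel, stride)
    return num_frames


# Assumed layer layouts (kernel, stride); only the 5x5 and 3x3 kernels of the
# second layer are confirmed by the error message.
SIX_X = [(3, 2), (5, 3)]   # total stride 2 * 3 = 6
FOUR_X = [(3, 2), (3, 2)]  # total stride 2 * 2 = 4

frames = 1000
print(subsampled_frames(frames, SIX_X))   # 165
print(subsampled_frames(frames, FOUR_X))  # 249
```

Under these assumptions the two front ends produce different sequence lengths and different parameter shapes, which is why a checkpoint trained with one cannot be loaded into a model built with the other.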

Emiyassstar avatar Apr 21 '23 08:04 Emiyassstar

Thanks for your reply. Has the w2v-conformer model trained with fbank features as input, which is mentioned in https://github.com/wenet-e2e/wenet/blob/1269a6e5bbec440302e934f243f623baeebf2758/examples/aishell/s0_ssl/README.md, been open-sourced?

rookie0607 avatar Apr 25 '23 03:04 rookie0607