Self-supervised pretraining (wav2vec 2.0/data2vec) for wenet

1. Support self-supervised pretraining using the wav2vec 2.0 and data2vec methods.
2. Add an SSL recipe in librispeech/ssl.
3. Add an SSL recipe in aishell/ssl.

Mar 31 '22 09:03 Emiyassstar

cool!

Mar 31 '22 11:03 misaka23

nice

Apr 02 '22 03:04 Yymax-max

Looking forward to the latest developments

Apr 08 '22 02:04 liufei1656

I tried to reproduce this example using the pretrained model at https://huggingface.co/emiyasstar/ch-w2v-conformer and got the following error:

Traceback (most recent call last):
  File "wenet/bin/train.py", line 322, in <module>
    main()
  File "wenet/bin/train.py", line 234, in main
    infos = load_trained_modules(model, args)
  File "/home/wenet_ssl/wenet/utils/checkpoint.py", line 95, in load_trained_modules
    model.load_state_dict(main_state_dict)
  File "/home/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1482, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Wav2vec2Model:
    Unexpected key(s) in state_dict: "encoder.embed.linear.weight", "encoder.embed.linear.bias".
    size mismatch for encoder.embed.conv.2.weight: copying a param with shape torch.Size([512, 512, 5, 5]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]).

What should I change to fix this? I would appreciate a reply.

Apr 03 '23 09:04 rookie0607
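
An error like the one above can be narrowed down by comparing parameter names and shapes before calling load_state_dict. The following is a minimal sketch, not part of wenet: the helper name diff_state_dicts is hypothetical, it works on plain dicts of shapes so that it runs without PyTorch, and the shapes of the two linear entries are placeholders because the error message does not report them.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_state_dicts(checkpoint: Dict[str, Shape],
                     model: Dict[str, Shape]) -> Dict[str, List[str]]:
    """Compare parameter names and shapes between a checkpoint and a model.

    Hypothetical helper for illustration; both arguments map a parameter
    name to its shape.
    """
    unexpected = sorted(k for k in checkpoint if k not in model)
    missing = sorted(k for k in model if k not in checkpoint)
    size_mismatch = sorted(
        k for k in checkpoint if k in model and checkpoint[k] != model[k]
    )
    return {
        "unexpected": unexpected,
        "missing": missing,
        "size_mismatch": size_mismatch,
    }


# Names and the conv shapes come from the error message in the question.
checkpoint_shapes = {
    "encoder.embed.conv.2.weight": (512, 512, 5, 5),
    "encoder.embed.linear.weight": (1, 1),  # placeholder shape (assumption)
    "encoder.embed.linear.bias": (1,),      # placeholder shape (assumption)
}
model_shapes = {
    "encoder.embed.conv.2.weight": (512, 512, 3, 3),
}

report = diff_state_dicts(checkpoint_shapes, model_shapes)
for kind, names in report.items():
    print(kind, names)
```

With PyTorch installed, the shape dicts could be built from real state dicts with `{k: tuple(v.shape) for k, v in state_dict.items()}`, assuming the checkpoint file holds a plain state dict.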

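ch-w2v-conformer uses a 6x downsampling model, and the pretraining-specific parameters were removed from it to stay compatible with the master branch code. You can load the model with the configuration file provided in the openasr recipe we released: https://github.com/wenet-e2e/wenet/tree/main/examples/openasr2021/s0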

Apr 21 '23 08:04 Emiyassstar
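
The 5x5 versus 3x3 kernel mismatch in the error is consistent with the checkpoint and the current model using different input downsampling rates. The sketch below works through the frame arithmetic under stated assumptions: the thread gives only the kernel sizes and the 6x rate, so the strides (2 then 3 for 6x, 2 then 2 for 4x), the absence of padding, and the guess that the current model downsamples by 4x are all assumptions, and the function names are hypothetical.

```python
from typing import List, Tuple


def conv_out_len(n: int, kernel: int, stride: int) -> int:
    """Output length of an unpadded 1-D convolution along the time axis."""
    return (n - kernel) // stride + 1


def subsampled_len(n: int, layers: List[Tuple[int, int]]) -> int:
    """Apply a stack of (kernel, stride) convolutions to a frame count."""
    for kernel, stride in layers:
        n = conv_out_len(n, kernel, stride)
    return n


# (kernel, stride) pairs; the strides are assumptions, not taken from the thread.
SIX_X = [(3, 2), (5, 3)]   # second layer has the 5x5 kernel seen in the checkpoint
FOUR_X = [(3, 2), (3, 2)]  # second layer has the 3x3 kernel seen in the current model

frames = 100
print("6x:", subsampled_len(frames, SIX_X))
print("4x:", subsampled_len(frames, FOUR_X))
```

Under these assumptions 100 input frames become 15 frames with the 6x stack and 24 frames with the 4x stack, which is why the two front ends cannot share weights and the model config has to match the checkpoint.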

Thank you for your reply. Has the w2v-conformer model trained with fbank features as input, as mentioned in https://github.com/wenet-e2e/wenet/blob/1269a6e5bbec440302e934f243f623baeebf2758/examples/aishell/s0_ssl/README.md, been open-sourced?

Apr 25 '23 03:04 rookie0607
