train u2++ conformer : AssertionError assert offset + size <= self.max_len
When training the u2++ conformer on a custom dataset, I encounter the error below; training stops after running for a while. Training with train_conformer.yaml worked fine.
/wenet/wenet/transformer/embedding.py", line 102, in position_encoding
assert offset + size <= self.max_len
AssertionError
To Reproduce Steps to reproduce the behavior:
- create a custom dataset following the librispeech example (using wav files instead of flac)
- change the config file to train_u2++_conformer.yaml in run.sh; the only thing I changed in the yaml file is the batch size
- run stage 4
- See error
Expected behavior Finish training normally
The training data is too long; it exceeds the maximum length of the embedding.
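To illustrate why an over-long utterance trips the assertion, here is a minimal toy sketch of a positional-encoding table with a fixed size. It is not WeNet's implementation: the class name `ToyPositionalEncoding` and its constructor arguments are hypothetical, and only the `offset + size <= self.max_len` check is taken from the traceback above.

```python
import math


class ToyPositionalEncoding:
    """Toy sinusoidal table; hypothetical, not the WeNet class."""

    def __init__(self, d_model, max_len):
        assert d_model % 2 == 0, "this sketch assumes an even d_model"
        self.max_len = max_len
        # Precompute encodings for positions 0 .. max_len - 1 only.
        self.table = []
        for pos in range(max_len):
            row = []
            for i in range(0, d_model, 2):
                angle = pos / (10000 ** (i / d_model))
                row.extend([math.sin(angle), math.cos(angle)])
            self.table.append(row)

    def position_encoding(self, offset, size):
        # Same check as in the traceback: a request that runs past the
        # precomputed table cannot be served.
        assert offset + size <= self.max_len
        return self.table[offset:offset + size]


pe = ToyPositionalEncoding(d_model=4, max_len=10)
short = pe.position_encoding(0, 10)   # fits in the table
try:
    pe.position_encoding(0, 11)       # one position too many
    too_long_failed = False
except AssertionError:
    too_long_failed = True
```

In this sketch the table is built once, so any input needing more positions than `max_len` fails, which matches the symptom of training running fine until a long utterance appears in a batch.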
What is the suggested solution? The same training data works with train_conformer.yaml, so does this only affect U2++?
Thanks
@Mddct
You need to limit each training utterance to 30 s or less, or increase max_len.
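The first option can be sketched as a pre-filtering step over the data list. This is only a sketch under assumptions: it assumes the list is a JSON-lines file where each entry has a `wav` field pointing to a PCM WAV file, and the helper names `wav_duration_seconds` and `filter_data_list` are hypothetical, not part of WeNet. The 30 s limit is the one suggested above.

```python
import json
import wave

MAX_SECONDS = 30.0  # limit suggested in this thread


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / float(wf.getframerate())


def filter_data_list(lines, max_seconds=MAX_SECONDS,
                     duration_fn=wav_duration_seconds):
    """Split JSON-lines entries into (kept, dropped) by audio duration.

    Assumes each non-empty line is a JSON object with a "wav" field.
    """
    kept, dropped = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        if duration_fn(entry["wav"]) <= max_seconds:
            kept.append(line)
        else:
            dropped.append(line)
    return kept, dropped
```

To use it, read the existing list with `open(...).readlines()`, pass the lines to `filter_data_list`, and write the `kept` lines to a new file for training. The `duration_fn` parameter exists so durations can come from another source (for example a precomputed table) instead of opening every file.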