Visual_Speech_Recognition_for_Multiple_Languages
Version issues
Can you please tell me the exact version of PyTorch you are using? I get some errors with version 2.0. Thank you!
It works with PyTorch 1.13.1, but the results are very bad.
Hi @jaydenjudith, can you please provide more information about the errors you're experiencing? I tested the 2.0.0 version and it worked for me. @matiter, can you please clarify which result you were referring to?
Hi @matiter @jayden-leo, Dr. Ma (@mpc001) is right: the PyTorch version (2.0.0) is the key point. With pytorch==1.13 I get 6.0% WER for the SOTA AVSR model, but with version 2.0.0 the results align well with the paper. Also, make sure to use torchaudio, as in the original code, to extract the audio from the mp4 files; other packages such as Videoclip may give different results.
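For anyone comparing environments, here is a minimal sketch of a version check you could run before evaluating. It only reads the installed package metadata, so it does not import torch itself. The helper names (`major_minor`, `installed_version`, `matches`) are made up for illustration and are not part of this repo, and the target of 2.0 is simply the version reported to work in this thread.

```python
from importlib import metadata
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Parse a version string such as '2.0.0+cu117' into (2, 0)."""
    core = version.split("+", 1)[0]  # drop a local build tag like '+cu117'
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches(version: Optional[str], expected: Tuple[int, int] = (2, 0)) -> bool:
    """True if the version is present and its major.minor equals `expected`."""
    return version is not None and major_minor(version) == expected


if __name__ == "__main__":
    # Assumption: 2.0 is the PyTorch release reported to reproduce the paper here.
    torch_version = installed_version("torch")
    print("torch:", torch_version, "(ok)" if matches(torch_version) else "(differs from 2.0)")
    # torchaudio is listed only so its version can be included in a report.
    print("torchaudio:", installed_version("torchaudio"))
```

Pasting the printed versions into a comment makes it easier to tell whether a WER gap comes from the environment or from the data preparation.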