LipFD
Question about the article
While reading your article, I had a question about how the audio information is embedded. As I understand it, the input to the Global Feature Encoder is just the image, not the spectrogram. Could you explain the specific processing applied to the input? Thanks!