Joint-Cross-Attention-for-Audio-Visual-Fusion
Is it alright to use the cam model from the Orig_cam file?
Hi, I want to study your code for audio-visual emotion recognition tasks.
In the main file, the cam model from the orig_cam file doesn't match what is described in your paper. The cam model in your cam file, on the other hand, seems to match the description, including the calculation of w_h, etc.
Is it alright to use the cam model from the Orig_cam file?