contextual-utterance-level-multimodal-sentiment-analysis
features dimensions
Hi,
How did you get just one row per utterance in the visual and text features (judging by the CSV files)? Is it the average over all frames/words in an utterance?
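To make the question concrete, here is a minimal sketch of the averaging I have in mind. It is only my guess at how one row per utterance could be produced, not something taken from the repository; `mean_pool` and the toy feature values are hypothetical.

```python
from statistics import fmean


def mean_pool(frames):
    """Average equal-length per-frame (or per-word) feature vectors
    into a single utterance-level vector.

    Assumption: simple mean pooling; the repository may do something else.
    """
    if not frames:
        raise ValueError("utterance has no frames")
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all frames must have the same feature dimension")
    # zip(*frames) groups the values of each feature dimension across frames.
    return [fmean(column) for column in zip(*frames)]


# Hypothetical utterance: 3 frames, each with a 2-dimensional feature vector.
utterance = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = mean_pool(utterance)  # one row for the whole utterance
print(pooled)
```

If the features were produced this way, an utterance with any number of frames would collapse to a single row with the same dimensionality as one frame.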