CMU-MultimodalSDK
Raw POM labels/data mismatch
Hi there,
There seems to be a mismatch between some of the video labels in movie_review_main_HIT_all_icmi.csv in the raw POM dataset and the video file names. Both the labels file and the video directory have 1,000 entries, but 96 videos appear to have no labels and 96 labels appear to have no videos. To map labels to video file names, I'm using the stem of the URL in the Input.videoLink column.
The videos without labels are:
['101281', '101733', '101747', '108573', '110209', '114945', '125698', '158832', '158855', '159031', '160225', '172075', '176541', '17811', '178941', '188334', '189856', '192827', '196364', '213375', '219620', '219627', '220550', '220832', '224238', '224271', '224285', '22884', '229637', '233405', '233431', '233940', '236108', '237385', '238042', '238545', '238597', '238803', '242166', '246117', '246789', '246798', '250343', '251251', '254216', '254488', '259857', '261052', '268127', '270771', '276771', '27880', '27881', '27886', '27888', '28412', '30036', '35692', '36060', '37462', '40367', '43428', '46592', '46808', '46826', '47056', '48219', '49962', '53233', '60308', '60328', '60351', '60402', '60983', '61035', '61247', '68721', '69307', '74115', '74197', '74204', '74268', '76819', '79019', '80640', '91002', '91252', '91284', '93845', '96106', '96377', '96595', '97463', '98513', '98636', '98649']
The labels without videos are:
['100483', '100498', '10177', '10291', '107014', '110929', '119351', '121763', '125657', '126944', '136255', '136369', '159588', '159605', '16146', '16980', '179873', '180865', '18279', '183895', '18588', '188123', '189296', '192798', '194106', '195333', '204451', '205499', '205896', '20915', '210519', '213264', '215712', '227420', '227874', '228105', '230702', '232457', '238712', '241155', '243053', '243888', '24695', '24759', '25639', '256944', '257927', '259159', '262334', '263727', '264076', '266609', '270996', '27775', '28151', '281984', '28335', '284005', '292588', '34637', '36158', '36209', '368969', '370399', '39282', '42682', '45433', '46254', '46441', '50455', '54457', '55506', '56148', '56810', '66261', '71029', '74594', '74998', '77229', '77237', '79701', '81276', '81544', '83106', '83283', '88099', '88139', '89839', '90412', '91350', '92686', '93827', '93976', '9524', '96683', '96685']
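For reference, the lists above can be reproduced with a minimal sketch like the following (assuming, as described above, that a video's ID is both the stem of its Input.videoLink URL and the stem of its file name; the function arguments are placeholders for the actual paths):

```python
import csv
from pathlib import Path

def label_ids(csv_path, link_col="Input.videoLink"):
    """Collect video IDs from the labels CSV: the stem of each video URL."""
    with open(csv_path, newline="") as f:
        return {Path(row[link_col]).stem for row in csv.DictReader(f)}

def video_ids(video_dir):
    """Collect video IDs from the file names in the video directory."""
    return {p.stem for p in Path(video_dir).iterdir() if p.is_file()}

def mismatch(labels, videos):
    """Return (videos without labels, labels without videos), sorted."""
    return sorted(videos - labels), sorted(labels - videos)
```

Running `mismatch(label_ids(...), video_ids(...))` on POM_RAW.zip yields the two 96-entry lists above.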
Can you help me understand what's going on here? Are there actually missing data/labels?
Hello @jackanth,
Hmm, I wonder if this could be a bug on my side. This is the script that generates the SDK file for the labels. At a glance I cannot see anything wrong with it, but I thought sharing it would be helpful. If you download the processed tensors from SDK V0, this issue doesn't seem to be there.
import os
from os import listdir
from os.path import isfile, join
from mmsdk import mmdatasdk
import time
import numpy
import csv

top = '.../corrected_aligned_POM/'
labelfile = '.../movie_review_main_HIT_all_icmi.csv'

# Video IDs are the file-name stems in the aligned-data directory.
files = [f.split('.')[0] for f in listdir(top) if isfile(join(top, f))]

data = {"persuasion": {}, "sentiment": {}, "personality": {}}
vid_lens = {}
dimensions = ["sentiment", "confident", "passionate", "voice_pleasant",
              "dominant", "credible", "vivid", "expertise", "entertaining",
              "reserved", "trusting", "lazy", "relaxed", "outgoing",
              "thorough", "nervous", "humerous", "persuasion"]

def read_labels():
    with open(labelfile, 'r', newline='') as csvfile:
        pomreader = csv.reader(csvfile, delimiter=',', quotechar='"')
        for count, line in enumerate(pomreader):
            if count == 0:  # skip the header row
                continue
            # Video ID is the stem of the URL in the Input.videoLink column.
            vid_id = line[33].split('/')[-1].split('.')[-2]
            vid_lens[vid_id] = float(line[32])
            # Each score is the leading character of its annotation column.
            confident = float(line[35][0])
            passionate = float(line[36][0])
            voice_pleasant = float(line[40][0])
            dominant = float(line[42][0])
            credible = float(line[43][0])
            vivid = float(line[44][0])
            expertise = float(line[46][0])
            entertaining = float(line[47][0])
            reserved = float(line[48][0])
            trusting = float(line[49][0])
            lazy = float(line[50][0])
            relaxed = float(line[51][0])
            outgoing = float(line[53][0])
            thorough = float(line[55][0])
            nervous = float(line[57][0])
            sentiment = float(line[61][0])
            persuasive = float(line[63][0])
            why_persuasive = line[64]
            humerous = float(line[65][0])
            if vid_id not in data["persuasion"]:
                data["personality"][vid_id] = []
                data["persuasion"][vid_id] = []
                data["sentiment"][vid_id] = []
            personality_vec = [confident, passionate, voice_pleasant, dominant,
                               credible, vivid, expertise, entertaining,
                               reserved, trusting, lazy, relaxed, outgoing,
                               thorough, nervous, humerous]
            data["personality"][vid_id].append(personality_vec)
            data["persuasion"][vid_id].append([persuasive])
            data["sentiment"][vid_id].append([sentiment])

read_labels()
print("Labels read:", len(data["sentiment"]))

data_final = {}
for f in data["sentiment"].keys():
    # One interval per video: [0, video length].
    intervals = numpy.array([0.0, vid_lens[f]])[None, :]
    # Average the per-annotator scores within each label group.
    features_sent = numpy.array(data["sentiment"][f]).mean(axis=0)[None, :]
    features_personality = numpy.array(data["personality"][f]).mean(axis=0)[None, :]
    features_persuasion = numpy.array(data["persuasion"][f]).mean(axis=0)[None, :]
    if features_sent.shape[0] != intervals.shape[0]:
        print("Different size of intervals and features")
        print(features_sent.shape, intervals.shape)
        time.sleep(1000)  # crude halt for inspection
    data_final[f] = {
        'intervals': intervals,
        'features': numpy.concatenate(
            [features_sent, features_personality, features_persuasion], axis=1),
    }

destination = ".../POM_Labels.csd"
rootname = "labels"
metadata = {
    "root name": rootname,
    "dimension names": dimensions,
    "computational sequence description": "POM Dataset Labels",
    "computational sequence version": 1.0,
    "alignment compatible": True,
    "dataset name": "POM",
    "dataset version": 1.0,
    "creator": "Amir Zadeh",
    "contact": "[email protected]",
    "featureset bib citation": "@online{emotient,author = {iMotions},title = {Facial Expression Analysis},year = {2017},url = {goo.gl/1rh1JN}}",
    "dataset bib citation": "@inproceedings{park2014computational,title={Computational analysis of persuasiveness in social multimedia: A novel dataset and multimodal prediction approach},author={Park, Sunghyun and Shim, Han Suk and Chatterjee, Moitreya and Sagae, Kenji and Morency, Louis-Philippe},booktitle={Proceedings of the 16th International Conference on Multimodal Interaction},pages={50--57},year={2014},organization={ACM}}",
}

labels = mmdatasdk.computational_sequence(rootname)
labels.setData(data_final, rootname)
labels.setMetadata(metadata, rootname)
labels.deploy(destination)
Thanks for the swift response!
I can confirm that the data and labels in http://immortal.multicomp.cs.cmu.edu/raw_datasets/POM_RAW.zip do not match. It looks like your script pulls from some other version of the data (corrected_aligned_POM). It's possible to download the missing videos from the original links in the labels CSV, but it would be great to be able to get the transcripts too.
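For anyone else hitting this, re-fetching the missing videos from the CSV could look roughly like the following sketch (the Input.videoLink column name is taken from this thread; `out_dir` is a placeholder, and the original URLs may of course have gone stale):

```python
import csv
import urllib.request
from pathlib import Path

def url_to_id(url):
    """Video ID = stem of the URL's last path component."""
    return Path(url.rstrip("/").split("/")[-1]).stem

def fetch_missing(csv_path, missing_ids, out_dir, link_col="Input.videoLink"):
    """Download each video whose ID is in missing_ids via its original URL."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            url = row[link_col]
            if url_to_id(url) in missing_ids:
                urllib.request.urlretrieve(url, out / url.split("/")[-1])
```

This only recovers the raw videos, not the transcripts, which is why access to the corrected archive would still be preferable.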
In 2018, @Justin1904 said on another issue "However, I think I previously talked to you about something similar in POM dataset where there're data points with no labels and labels with no corresponding data points. I wonder if they are due to the same bugs." – perhaps he can shed some light?
@jackanth, it is possible that I am using a new forced alignment output. However, it is unlikely that would change too much. Did your investigation reveal the source of this problem?
Hi @A2Zadeh – thanks for checking in!
POM_RAW.zip does indeed have 96 videos without labels and 96 labels without videos. My guess is that you're not using exactly the contents of this archive on your side, but a slightly different version that has addressed the mismatch. Perhaps that's what corrected means in the directory name corrected_aligned_POM in your script, which otherwise looks fine to me.
If corrected_aligned_POM includes a video called 100483.{mp4/whatever the ext is} and/or other videos from the second list I shared in my original post, then it has diverged from POM_RAW.zip and has probably been corrected for this mismatch. Could you confirm this? If so, it would be great to be able to access the corrected version.
Thanks in advance!