MSVA
extract object features
Hi, I am trying to extract the object features using your code in https://github.com/VideoAnalysis/EDUVSUM/tree/master/src.
According to your paper, you use GoogLeNet trained on ImageNet. I assume the features are extracted with the model "modelInceptionV3", as in the code. However, the shape of the feature returned by "inceptionv3_feature = modelInceptionV3.predict(frmRz299)" is (8, 8, 2048). I tried changing the model initialization to "modelInceptionV3=InceptionV3(weights='imagenet', pooling='avg', include_top=False)" to get a 2048-dimensional feature vector. However, the object feature vector in the MSVA code has length 1024, and the feature values produced by the extraction code are quite different from those in the MSVA code. In the former, the values can be larger than 1, while in the latter they seem to be normalized to the [0, 1] range.
Have I missed something?
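
To make the shape question concrete, here is a minimal NumPy sketch of the pooling step I am describing. It is only an illustration under assumptions: the random array is a stand-in for the real (8, 8, 2048) InceptionV3 output, and the helper names `global_average_pool` and `value_range` are hypothetical, not taken from the EDUVSUM or MSVA code.

```python
import numpy as np


def global_average_pool(feature_map):
    """Average an (H, W, C) feature map over its spatial axes, giving a (C,) vector."""
    return feature_map.mean(axis=(0, 1))


def value_range(features):
    """Return (min, max) so two feature sets can be compared for normalization."""
    return float(features.min()), float(features.max())


# Random stand-in for the (8, 8, 2048) map; NOT real InceptionV3 output.
rng = np.random.default_rng(0)
fake_map = rng.gamma(shape=1.0, scale=1.0, size=(8, 8, 2048)).astype(np.float32)

pooled = global_average_pool(fake_map)
print(pooled.shape)         # (2048,)
print(value_range(pooled))  # compare against the range of the stored MSVA features
```

Averaging over the spatial axes gives a 2048-dimensional vector, which is what I understand pooling='avg' to do. It cannot produce a 1024-dimensional vector by itself, so the 1024-length features would have to come from a different model or from an extra reduction step.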