Samuel

29 comments by Samuel

Do you think PySceneDetect should work well for shot detection (rather than scene detection) in content-aware mode?

@Breakthrough Thanks - I figured that was the meaning but I've noticed that sometimes the two are considered distinct (e.g. in this explanation http://production.4filmmaking.com/cinematography1.html). In case it's of interest to...

@Breakthrough That sounds like an appropriate use case - I'll add a test that includes commercials.

Hi @zouying-sjtu, to extract the features we used pretrained models released by the following codebases:

* [pretrainedmodels](https://github.com/Cadene/pretrained-models.pytorch) for the squeeze-and-excitation features (pytorch)
* [Face models](https://github.com/WeidiXie/Keras-VGGFace2-ResNet50) trained on VGG-Face2 (keras)
* ...

Hi @zouying-sjtu, yes those are both correct! Sorry for the lack of clarity.

Hi @escorciav, We have a copy of the raw features (i.e. extracted densely from each frame), but in most settings we use aggregated versions (and these were what we uploaded)...
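To illustrate the difference between dense per-frame features and an aggregated version, here is a minimal sketch that pools a list of frame-level feature vectors into one video-level vector. The comment does not say which aggregation was used, so the mean and max pooling options, and the function name `aggregate_frames`, are assumptions made purely for illustration.

```python
from typing import List, Sequence


def aggregate_frames(
    frame_feats: Sequence[Sequence[float]], mode: str = "mean"
) -> List[float]:
    """Pool per-frame feature vectors into a single video-level vector.

    Hypothetical helper: the pooling actually used for the uploaded
    features is not stated in the comment above.
    """
    if not frame_feats:
        raise ValueError("need at least one frame feature")
    # Transpose so that each tuple holds one feature dimension across frames.
    columns = list(zip(*frame_feats))
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown aggregation mode: {mode}")


# Two frames, two feature dimensions each.
dense = [[1.0, 2.0], [3.0, 6.0]]
mean_feat = aggregate_frames(dense, "mean")
max_feat = aggregate_frames(dense, "max")
```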

Hi @xixiareone, do you have a pointer to that sentence (e.g. in which section in the article it appears)? Thanks! For reference, in our setting, we train by randomly sampling...

No worries! In the test phase, all sentences are used (independently). The evaluation we use was based on the protocol used here: https://github.com/niluthpol/multimodal_vtt
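As a rough illustration of what "all sentences are used (independently)" can mean at test time, the sketch below treats every sentence as its own query and scores it with recall@K against the video it describes. This is a generic retrieval metric written under that assumption; it is not copied from the linked repository, and the names `recall_at_k`, `sim_rows` and `gt_video` are hypothetical.

```python
from typing import List, Sequence


def recall_at_k(
    sim_rows: Sequence[Sequence[float]], gt_video: Sequence[int], k: int
) -> float:
    """Fraction of sentence queries whose ground-truth video ranks in the top k.

    sim_rows[i][j] is the similarity between sentence i and video j;
    gt_video[i] is the index of the video that sentence i describes.
    """
    if len(sim_rows) != len(gt_video) or not sim_rows:
        raise ValueError("need one ground-truth index per non-empty query list")
    hits = 0
    for row, gt in zip(sim_rows, gt_video):
        # Rank video indices from most to least similar for this sentence.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if gt in ranked[:k]:
            hits += 1
    return hits / len(sim_rows)


# Three sentences, two videos; sentences 1 and 2 both describe video 1.
sims = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
targets = [0, 1, 1]
r1 = recall_at_k(sims, targets, k=1)
r2 = recall_at_k(sims, targets, k=2)
```

Because each sentence is an independent query, a video with many captions contributes many rows to the metric rather than a single averaged one.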

I agree it's confusing! I've summarised below my understanding of the evaluation protocols for MSVD. ### Design choices There are two choices to be made for datasets (like MSVD) that...

Hi @xixiareone, 81 is the maximum number of sentences for a single video used in MSVD. For efficiency, we compute a similarity matrix with a fixed number of sentences per...
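The idea of using a fixed number of sentence slots per video can be sketched as padding plus a validity mask. The snippet below is a minimal illustration and not the project's code: the helper names, the zero-vector padding and the dot-product similarity are all assumptions, and a small cap is used in the example instead of 81 to keep it readable.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def pad_sentences(
    sentences_per_video: Sequence[Sequence[Vector]], max_sentences: int, dim: int
) -> Tuple[List[List[List[float]]], List[List[bool]]]:
    """Pad every video's sentence embeddings to max_sentences slots.

    Returns the padded embeddings and a mask marking the real sentences.
    """
    padded, mask = [], []
    for sents in sentences_per_video:
        if len(sents) > max_sentences:
            raise ValueError("video has more sentences than max_sentences")
        n_pad = max_sentences - len(sents)
        padded.append([list(s) for s in sents] + [[0.0] * dim for _ in range(n_pad)])
        mask.append([True] * len(sents) + [False] * n_pad)
    return padded, mask


def masked_similarities(
    padded: Sequence[Sequence[Vector]],
    mask: Sequence[Sequence[bool]],
    video_embs: Sequence[Vector],
) -> List[List[float]]:
    """Dot-product similarity of each real sentence to every video.

    Padded slots are skipped, so only valid sentences produce a row.
    """
    rows = []
    for sents, valid in zip(padded, mask):
        for sent, ok in zip(sents, valid):
            if ok:
                rows.append([sum(a * b for a, b in zip(sent, vid)) for vid in video_embs])
    return rows


# Two videos with 1 and 2 sentences, padded to 3 slots (81 in the MSVD case).
sentences = [[[1.0, 0.0]], [[0.0, 1.0], [1.0, 1.0]]]
videos = [[1.0, 0.0], [0.0, 1.0]]
padded, mask = pad_sentences(sentences, max_sentences=3, dim=2)
rows = masked_similarities(padded, mask, videos)
```

Padding to a fixed size keeps the similarity computation regular, while the mask ensures that the empty slots never count as queries.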