LearningToCut

inference for a video

Open segalinc opened this issue 3 years ago • 4 comments

Hi, thanks for sharing your work. How can we use it to extract only the shot boundaries from a video? I have already extracted both video and audio features, but the code seems to be set up only for evaluation, and it is not clear how to use it for inference alone. Thank you.

segalinc, Jul 26 '22 20:07

Hi, the code assumes that the shot boundaries are already given. We extracted them with an internal tool. However, some colleagues of mine have tried TransNetV2, which works pretty well. If you only need the shot boundaries, I recommend using that tool. I can provide the inference code, which takes as input the video and audio features from two different shots. Its output is a ranking of the possible places to make a transition between those two shots. Please let me know if there is anything I can help with.

PardoAlejo, Jul 27 '22 08:07

Hi, thanks for your quick reply. I am already evaluating TransNetV2, but I also wanted to see how your algorithm compares to ours as a further evaluation. I was able to extract both video and audio features from the video I want to split into shots. If you could provide the inference part of your algorithm, that would be great, so we can add it to the comparison as one of the latest published works on the topic.

Thank you

segalinc, Jul 27 '22 17:07

Just to clarify, our algorithm will give you a cut-plausibility ranking given a pair of independent shots. This ranking will give the best places to join these shots, but it assumes that the shots are already separated. In other words, it won't give shot boundaries but will use the shot boundaries as input to separate the shots and find the best places to cut and join such a pair of shots.

After clarifying this, I will work on the inference code over the weekend. The input of the inference can be one of these two options:

  • Two separated shots, i.e. two videos without cuts (transitions) in them.
  • A single video with two shots and their corresponding shot boundary information.

It will output the best time to cut on both input shots. I'll notify you once it's ready.
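To make the two input options concrete, here is a rough sketch of the input/output shape only. It is not the repository's inference code: the learned cut-plausibility model is replaced by plain cosine similarity between per-frame feature vectors, and the names `split_at_boundary` and `rank_cuts` are hypothetical, chosen just for this illustration.

```python
import math
from typing import List, Sequence, Tuple

Feature = Sequence[float]


def cosine(u: Feature, v: Feature) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def split_at_boundary(
    features: Sequence[Feature], boundary: int
) -> Tuple[List[Feature], List[Feature]]:
    """Second input option: one video plus a known shot boundary.

    `boundary` is assumed to be the index of the first frame of the second shot.
    """
    if not 0 < boundary < len(features):
        raise ValueError("boundary must fall strictly inside the video")
    return list(features[:boundary]), list(features[boundary:])


def rank_cuts(
    shot_a: Sequence[Feature], shot_b: Sequence[Feature], top_k: int = 5
) -> List[Tuple[int, int, float]]:
    """Rank (frame in shot A, frame in shot B) pairs as candidate cut points.

    The score is a stand-in (cosine similarity), NOT the learned model.
    Returns up to `top_k` tuples (index_in_a, index_in_b, score), best first.
    """
    scored = [
        (cosine(fa, fb), i, j)
        for i, fa in enumerate(shot_a)
        for j, fb in enumerate(shot_b)
    ]
    # Highest score first; ties are broken by the earliest indices.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(i, j, score) for score, i, j in scored[:top_k]]


# First option: two separate shots given directly as per-frame features.
shot_a = [[1.0, 0.0], [0.0, 1.0]]
shot_b = [[0.0, 1.0], [1.0, 1.0]]
best = rank_cuts(shot_a, shot_b, top_k=1)

# Second option: a single video and the boundary between its two shots.
video = shot_a + shot_b
first_shot, second_shot = split_at_boundary(video, boundary=2)
```

In both cases the result is a ranked list of frame pairs, one index per shot, which can be converted to timestamps using the frame rate at which the features were extracted.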

PardoAlejo, Jul 28 '22 13:07

Thanks for clarifying this. I guess the use case we want to compare is a different one, namely shot boundary detection. But now that I think about it, this will still be useful for another work we want to compare with. This is great! Thank you. No rush to work on this during the weekend :)

segalinc, Jul 28 '22 15:07