
Can OFA support video-language tasks such as video captioning?

Open dinglei8908 opened this issue 3 years ago • 1 comments

Suppose we can extract several frames from a video. Any suggestions on how to approach this?

dinglei8908 · Oct 12 '22 07:10

Not done yet, but possible. We still need to figure out whether this requires changes to pretraining or whether the pretrained models can simply be adapted to the task. The simplest approach might be to treat the average of the frames as a single image.

JustinLin610 · Oct 14 '22 09:10
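
For readers who want to try the frame-averaging idea from the reply above, here is a minimal sketch in Python with NumPy. It is not code from the OFA repository: the helper names `sample_indices` and `average_frames` are hypothetical, and it assumes the frames have already been decoded into same-sized `H x W x C` uint8 arrays by whatever video reader you use.

```python
import numpy as np


def sample_indices(num_frames, num_samples):
    """Pick evenly spaced frame indices across a clip (hypothetical helper)."""
    if num_frames < 1 or num_samples < 1:
        raise ValueError("num_frames and num_samples must be positive")
    positions = np.linspace(0, num_frames - 1, num=num_samples)
    return positions.round().astype(int).tolist()


def average_frames(frames):
    """Collapse same-sized H x W x C uint8 frames into one averaged image.

    Assumption: every frame has the same shape and channel order.
    """
    if len(frames) == 0:
        raise ValueError("need at least one frame")
    # Accumulate in float32 so the sum does not overflow uint8.
    stacked = np.stack([np.asarray(f, dtype=np.float32) for f in frames], axis=0)
    mean = stacked.mean(axis=0)
    # Round and clip back to a valid 8-bit image.
    return np.clip(np.rint(mean), 0, 255).astype(np.uint8)
```

The resulting array could then be fed to an image-captioning model in place of a single image. Note that averaging throws away temporal order, so this is best seen as a quick baseline rather than a real video model.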