Knownothing
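@whysirier I'm a beginner here. Your data looks like it follows a standard template, so couldn't you just batch-match against it and return the results directly? Why use a large model for everything?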
> Hi [@RocketFlash](https://github.com/RocketFlash),
>
> During our training, we did not use the format of frame numbers, so the model may encounter difficulties in understanding video frame indices. We...
> Hi,
>
> You are right. We input the video by extracting frames into multiple images. Please refer to [the usage example in README.md](https://github.com/OpenGVLab/InternVL/blob/dd3635206874c92386185d586fffeda1026d3a76/README.md?plain=1#L996). You can set the number...