Video-LLaVA Support multiple rounds of video conversations?

Great work! As video conversations in the instruction dataset have only one round in this version, if I want to train and test multiple rounds of video conversions, what should I do? Thanks!

Jan 23 '24 03:01 JiaweiZhao-git

Simply just need to organize the multi-round conversation data in the format of llava_image_tune_.json. llava_image_tune_.json has examples of multi-round conversations in it, even though it is images.

For the dataset source you can use VideoChat.

Jan 23 '24 14:01 LinB203

Does this repo support inference and evaluation of multiple rounds of video conversations currently? Which file should I refer?

Jan 25 '24 02:01 JiaweiZhao-git

Does this repo support inference and evaluation of multiple rounds of video conversations currently? Which file should I refer?

You can refer to this. But I'm not sure the second output of the model is useful.

Jan 25 '24 06:01 LinB203

Simply just need to organize the multi-round conversation data in the format of llava_image_tune_.json. llava_image_tune_.json has examples of multi-round conversations in it, even though it is images.

For the dataset source you can use VideoChat.

where can I get llava_image_tune_.json? this file is not contained in datasets

Mar 26 '24 15:03 silence143