Open-Sora

caption_llava.py output empty information

Open DwanZhang-AI opened this issue 1 year ago • 2 comments

When using liuhaotian/llava-v1.6-vicuna-7b or liuhaotian/llava-v1.6-mistral-7b for video captioning, the output is very often empty. Raising the temperature from 0.2 to 0.8 helps, but the model still fails on some videos.

DwanZhang-AI, Mar 21 '24 19:03
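
Since a higher temperature reportedly reduces the empty outputs, one possible workaround is to retry a video at increasing temperatures until a non-empty caption comes back. The sketch below is only an illustration: `generate` is a hypothetical callable standing in for whatever produces a caption for one video (it is not a function from caption_llava.py), and the intermediate temperature 0.5 is an assumption.

```python
from typing import Callable, Iterable


def caption_with_retry(
    generate: Callable[[str, float], str],
    video_path: str,
    temperatures: Iterable[float] = (0.2, 0.5, 0.8),
) -> str:
    """Return the first non-empty caption, trying each temperature in order."""
    for temperature in temperatures:
        # `generate` is a hypothetical stand-in for the real captioning call.
        caption = generate(video_path, temperature).strip()
        if caption:
            return caption
    # Every attempt came back empty; the caller decides how to handle it.
    return ""


# Stand-in generator that only produces text at temperature 0.8.
def fake_generate(video_path: str, temperature: float) -> str:
    return "" if temperature < 0.8 else "a dog runs on a beach"


result = caption_with_retry(fake_generate, "clip.mp4")
print(result)
```

This only hides the symptom. A video that stays empty at every temperature still needs a different fix.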

In our experiments, we only use the 34b model because the 7b model's ability is poor, so we have not tested the 7b model's caption quality. I am concerned that the 7b model's quality cannot satisfy the requirements of video generation.

zhengzangw, Mar 25 '24 05:03

Me too: output.csv contains empty information. I am using llava-v1.6-vicuna-7b.

mysuns, Mar 25 '24 10:03

> Me too: output.csv contains empty information. I am using llava-v1.6-vicuna-7b.

The input token length exceeds the maximum token length of the 7b model.

DwanZhang-AI, Mar 26 '24 02:03

> > Me too: output.csv contains empty information. I am using llava-v1.6-vicuna-7b.
>
> The input token length exceeds the maximum token length of the 7b model.

How can I fix this problem?

KihongK, Jul 24 '24 05:07
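
If the cause really is that the input exceeds the model's maximum token length, one direction to try is to pass fewer frames (or a shorter prompt) so the input fits. The sketch below only does the budget arithmetic. All numbers in the usage example are illustrative assumptions, not values taken from caption_llava.py: 576 is the patch count of a single 336×336 image at patch size 14, and the real per-frame token count for LLaVA-1.6 may be higher because of its multi-tile input.

```python
def max_frames_that_fit(
    max_context: int,
    prompt_tokens: int,
    tokens_per_frame: int,
    max_new_tokens: int,
) -> int:
    """Return how many frames fit in the context window alongside the prompt
    and the tokens reserved for the generated caption."""
    if tokens_per_frame <= 0:
        raise ValueError("tokens_per_frame must be positive")
    # Tokens left for image input after the text prompt and the output.
    budget = max_context - prompt_tokens - max_new_tokens
    return max(0, budget // tokens_per_frame)


# Illustrative values only (assumptions, check them against your own setup).
frames = max_frames_that_fit(
    max_context=4096,      # assumed context window of the 7b language model
    prompt_tokens=100,     # assumed length of the text prompt
    tokens_per_frame=576,  # assumed visual tokens per frame
    max_new_tokens=256,    # assumed room reserved for the caption
)
print(frames)  # 6
```

If the script feeds more frames than this estimate allows, reducing the number of sampled frames is a reasonable first thing to test. Switching to the 34b model, as the maintainers do, is the other option mentioned in this thread.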