LLaVA-NeXT
Didn't stage 1.5 add Chinese OCR data? Why does Chinese recognition still have no effect at all?
When I run `bash scripts/video/demo/video_demo.sh ${path to LLaVA-NeXT-Video-7B-DPO} vicuna_v1 32 2 True ${path to video}`, I get the error `Can't set vocab_size with value 32000 for LlavaConfig...`
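One possible workaround, assuming the error is raised because the script assigns `vocab_size` directly on the composite `LlavaConfig` instead of on its nested text config, is sketched below. The local path is a placeholder for the checkpoint directory, and this is an illustration, not a confirmed fix:

```python
# Minimal sketch of a possible workaround, assuming the error comes from
# setting vocab_size on the composite LlavaConfig rather than its nested
# text_config. The path below is a placeholder for your checkpoint directory.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("/path/to/LLaVA-NeXT-Video-7B-DPO")

# On recent transformers releases, vocab_size for Llava-style configs lives on
# the nested text config; assigning it on the top-level config can raise the
# error shown above.
if hasattr(config, "text_config"):
    config.text_config.vocab_size = 32000
else:
    config.vocab_size = 32000

print(config)
```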
Hi LLaVA-NeXT team, will there be official llava-hf versions of the new LLaVA-NeXT (2024-05 release) models soon?
The best thing to do would be to author correct chat templates for both your repository's Hugging Face models and the official llava-hf ones. The templates should also work with the AutoProcessor...
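As a rough sketch of what that interaction could look like with an llava-hf checkpoint (the model id here is illustrative, and `apply_chat_template` on processors is assumed from recent transformers releases):

```python
# Minimal sketch, assuming the llava-hf processor ships a chat template and
# exposes apply_chat_template as in recent transformers releases. The model id
# is illustrative.
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("llava-hf/llava-v1.6-vicuna-7b-hf")

conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What is shown in this image?"},
        ],
    }
]

# If the stored chat template is correct, this yields the exact prompt string
# the model expects, with no hand-written formatting needed.
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
print(prompt)
```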
The output of `output_ids` is `tensor([[1, 2]], device='cuda:0')`. The other output of the demo script is: Question: A chat between a curious user and an artificial intelligence assistant. The assistant gives...
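For reference, token ids 1 and 2 are the Llama BOS/EOS tokens used by Vicuna, so `tensor([[1, 2]])` suggests the model generated no actual text. A small check, assuming `tokenizer` and `output_ids` are the demo script's existing locals:

```python
# Quick debugging sketch: decode the returned ids to confirm the generation is
# degenerate. The names tokenizer and output_ids are assumed to be the demo
# script's existing locals.
decoded = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
print("raw ids:", output_ids)   # tensor([[1, 2]]) is just BOS followed by EOS
print("decoded:", decoded)      # [''] here means no text was generated
```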
Does this mean that you trained all the visual encoders during fine-tuning? If so, what are the specific training settings?
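For context, making the vision encoder trainable during fine-tuning would look roughly like the sketch below; the `get_model().get_vision_tower()` accessor follows the LLaVA codebase convention, but this is an illustration, not the authors' confirmed recipe:

```python
# Sketch of unfreezing the vision encoder for fine-tuning. The accessor names
# follow the LLaVA codebase convention but are assumptions here.
def set_vision_tower_trainable(model, trainable: bool = True) -> int:
    vision_tower = model.get_model().get_vision_tower()
    for param in vision_tower.parameters():
        param.requires_grad = trainable
    # Return the number of trainable parameters as a quick sanity check.
    return sum(p.numel() for p in vision_tower.parameters() if p.requires_grad)
```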
Greetings. I noticed that you have released an S^2-finetuned LLaVA-NeXT checkpoint, but I cannot find any description or benchmarks for it in this repo. Is there any description / benchmark...
We use LLaVA-NeXT-Video-DPO (34B).