g-h-chen

Results 7 issues of g-h-chen

Hi Haotian,

```
key                  | cand/anchor | anchor | cand
all                  | 81.0        | 83.7   | 67.8
llava_bench_complex  | 88.4        | 85.0   | 75.2
llava_bench_conv     | 77.0        | 80.6   | 62.1
llava_bench_detail   | 71.3        | 84.7   | 60.3
```

Here is the...

### Discussion Hi Haotian, We recently released [ALLaVA-4V](https://huggingface.co/datasets/FreedomIntelligence/ALLaVA-4V), a large dataset with fine-grained captions and complex-reasoning QA pairs. Including our data can significantly boost model performance on reasoning...

Thanks for your great work! LLaMA-VID supports single-image and video input, but does it also support multi-image input? If not, what's the quickest way to adapt it to multi-image input? Thanks in advance!

Hi! Just a kind reminder that there is a typo in the GitHub link on Hugging Face. Please check and update it.

Hi, this is really comprehensive work. Could you add our recent work to your survey? [CMB: A Comprehensive Medical Benchmark in Chinese](https://arxiv.org/abs/2308.08833) **Thanks**

Thanks for your great work! In **Multimodal MobiLlama** of the **Results** section, you briefly introduce how you developed MobiLlama-V. The model seems to have a LLaVA-like architecture, but is only...