Xuan Dong issues

Results: 3 issues by Xuan Dong

Cannot reproduce VideoChatGPT generative performance results

Hello, and thank you for your contribution! I'm trying to reproduce the generative performance scores in the VideoChatGPT evaluation for the model EVA-G & LLaVA1.5-VideoChatGPT-Instruct 7B. I have downloaded your...

How to access video data in LLaVA-OneVision?

Thank you for your contribution. In the Hugging Face `lmms-lab/LLaVA-OneVision-Data` repo, I can find only single-image data, and in your `scripts/train/README.md`, you say that the video incorporates **Youcook2 (32267),...

How to guide the caption model with the a11y tree information?

Hi, first of all, thanks for your great work on Omniparser V2! After reviewing the code in demo.ipynb, I understand that the workflow of Omniparser V2 involves: 1. Using an...