Yongshuo Zong
Hi, we developed an ICL benchmark for VLLMs here: https://github.com/ys-zong/VL-ICL. You're welcome to try it out. @kennymckormick I wonder if you have a plan to integrate VL-ICL into this very useful...
Hi guys, you can use our codebase for ICL: https://github.com/ys-zong/VL-ICL
FYI. I didn't find a neat way for few-shot BLIP, but I implemented the few-shot inference of many other V-L models here: https://github.com/ys-zong/VL-ICL
Hi, thanks for your interest. We used exactly the same fine-tuning scripts as the original LLaVA (https://github.com/haotian-liu/LLaVA/blob/main/scripts/v1_5/finetune.sh) and MiniGPT-4 (https://github.com/Vision-CAIR/MiniGPT-4/blob/main/MiniGPTv2_Train.md). For example, for LLaVA fine-tuning, you can first convert our...
Thanks for your reply! Yes, I have cast all the outputs to lowercase. > "truncate the model output to the length of the longest ground truth answer" Does the "longest...
Great! Now I get an accuracy of 27.5% after truncation. Thanks a lot for the help! I'd still like to check your implementation to find the last minor difference...
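For reference, the lowercase-then-truncate evaluation discussed above can be sketched roughly as follows. This is a hypothetical illustration, not the repo's actual code: `truncated_accuracy` is an invented name, and I'm assuming "longest ground truth answer" means the longest answer across the whole dataset (measured in characters) rather than per sample.

```python
def truncated_accuracy(predictions, ground_truths):
    """Exact-match accuracy after lowercasing both sides and truncating
    each prediction to the length of the longest ground-truth answer.
    Illustrative sketch only; check the repo for the real protocol."""
    preds = [p.strip().lower() for p in predictions]
    gts = [g.strip().lower() for g in ground_truths]
    # Longest ground-truth answer length (characters) over the dataset.
    max_len = max(len(g) for g in gts)
    correct = sum(p[:max_len] == g for p, g in zip(preds, gts))
    return correct / len(gts)

# A verbose model output like "Yes, the lesion is benign." is truncated
# to "yes" before comparison, so it matches the short ground truth "yes".
print(truncated_accuracy(["Yes, the lesion is benign.", "no"], ["yes", "no"]))
```

Without the truncation step, verbose generations would never exactly match short ground-truth answers, which is why the reported accuracy jumps once it is applied.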
Hi, perhaps I didn't include OL3I in the arguments, as this dataset was added later. Feel free to add it yourself; the implementation should be very...
This should also be straightforward to implement, following the paper and the other pre-processing scripts. Let me know if you have any problems.
From a quick look at your code, it seems you didn't use the LLaVA [conversation template](https://github.com/ys-zong/VLGuard/blob/d889f8d04808635aad63148def0e46e4beb87afc/utils/model_utils.py#L17) but fed in the raw text directly, which may explain the differences. Can you modify...
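To illustrate the difference, here is a rough sketch of wrapping a raw question in a LLaVA-v1.5-style conversation template instead of passing the bare text. The system prompt and `USER:`/`ASSISTANT:` markers below follow the commonly used LLaVA v1.5 format, but `build_llava_prompt` is an invented helper; the linked `model_utils.py` has the actual template used in the repo.

```python
def build_llava_prompt(question: str) -> str:
    """Wrap a raw question in a LLaVA-v1.5-style conversation template.
    Illustrative only; defer to the repo's conversation template."""
    system = ("A chat between a curious human and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the human's questions.")
    # <image> marks where the image tokens are spliced in; the prompt must
    # end with "ASSISTANT:" so the model generates the answer turn.
    return f"{system} USER: <image>\n{question} ASSISTANT:"

print(build_llava_prompt("Is this image safe for work?"))
```

Instruction-tuned models like LLaVA are trained only on prompts in this conversational format, so feeding raw text at evaluation time shifts the input distribution and can noticeably change the outputs.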
You could use our codebase for in-context learning: https://github.com/ys-zong/VL-ICL