
Inconsistent max_new_tokens

Open amitbcp opened this issue 1 year ago • 2 comments

Across different LMMs, max_new_tokens is set differently. I believe we should have a consistent MAX_NEW_TOKENS across the project, set to 512 or 1024.

If it makes sense, I can create a PR to modify all of them.

amitbcp · Aug 03 '24 20:08

Hi @amitbcp, are there any specific cases you are talking about? I think for most VLMs we adopt a MAX_NEW_TOKENS >= 512.

kennymckormick · Aug 04 '24 15:08

@kennymckormick:

For example:

  1. In MiniCPM we define a max length, which we haven't done for other models: https://github.com/open-compass/VLMEvalKit/blob/22991ca6109c5d4e65bc4a1a9273234d23c3e13f/vlmeval/vlm/minicpm_v.py#L186

  2. For Phi3 it's 500: https://github.com/open-compass/VLMEvalKit/blob/22991ca6109c5d4e65bc4a1a9273234d23c3e13f/vlmeval/vlm/phi3_vision.py#L36

  3. For Qwen we also adjust the token count: https://github.com/open-compass/VLMEvalKit/blob/22991ca6109c5d4e65bc4a1a9273234d23c3e13f/vlmeval/vlm/qwen_vl.py#L38

  4. For BunnyLlama: https://github.com/open-compass/VLMEvalKit/blob/22991ca6109c5d4e65bc4a1a9273234d23c3e13f/vlmeval/vlm/bunnyllama3.py#L131

  5. For CogVLM we use only 2048: https://github.com/open-compass/VLMEvalKit/blob/22991ca6109c5d4e65bc4a1a9273234d23c3e13f/vlmeval/vlm/cogvlm.py#L24

and more.

So, should we set a consistent max_new_tokens for all models so that they are evaluated on the same basis?
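If we go this route, a minimal sketch of the pattern could look like the following. Note that DEFAULT_MAX_NEW_TOKENS, the VLMEVAL_MAX_NEW_TOKENS environment variable, and ExampleVLM are hypothetical names for illustration, not existing VLMEvalKit code:

```python
import os

# Hypothetical project-wide default that every model wrapper could
# import, overridable per run via an environment variable.
DEFAULT_MAX_NEW_TOKENS = int(os.environ.get("VLMEVAL_MAX_NEW_TOKENS", "1024"))


class ExampleVLM:
    """Illustrative wrapper showing the proposed shared-default pattern."""

    def __init__(self, max_new_tokens: int = DEFAULT_MAX_NEW_TOKENS):
        # Instead of each model hard-coding its own value (500, 512,
        # 2048, ...), all wrappers fall back to the same constant.
        self.max_new_tokens = max_new_tokens

    def build_gen_kwargs(self) -> dict:
        # Generation kwargs passed to the underlying model's generate().
        return dict(max_new_tokens=self.max_new_tokens, do_sample=False)
```

Each wrapper would then import the shared constant instead of hard-coding its own value, and a single environment variable (or CLI flag) could override it for every model at once.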

amitbcp · Aug 10 '24 02:08