Training the 1B model on a 32 GB V100 GPU: flash_attention is not supported. Has anyone trained the 1B model on a V100? A100s are expensive.

Open CLIsVeryOK opened this issue 1 year ago • 4 comments

Addition: my task is simple single-image classification. I find that the 1B model outperforms CLIP by a large margin, so I want to train the 1B model on a V100.

CLIsVeryOK avatar Dec 23 '24 08:12 CLIsVeryOK

You can set _attn_implementation to eager in the config to disable flash attention.

Weiyun1025 avatar Dec 23 '24 10:12 Weiyun1025
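
For reference, a minimal sketch of that change when loading through HF transformers (the checkpoint name is an example and the exact attribute placement may vary between transformers versions):

```python
from transformers import AutoConfig, AutoModel

path = "OpenGVLab/InternVL2_5-1B"  # example checkpoint; substitute your own path

# Load the checkpoint's config, switch the attention backend to eager,
# and pass the edited config back in when loading the model.
config = AutoConfig.from_pretrained(path, trust_remote_code=True)
config._attn_implementation = "eager"

model = AutoModel.from_pretrained(path, config=config, trust_remote_code=True)
```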

Hello @Weiyun1025, thanks for your reply.

After this change I am still getting the error RuntimeError: FlashAttention only supports Ampere GPUs or newer. For your reference, I have shared my current notebook in case you can help: https://github.com/kachhadiyaraj15/internvl_testing/blob/main/inrternvl_2_5_flash_attention_error.ipynb

kachhadiyaraj15 avatar Feb 21 '25 18:02 kachhadiyaraj15
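
If the config edit alone does not take effect, flash attention may also be toggled at model-load time. The sketch below follows the pattern of the public InternVL model-card quickstarts (the use_flash_attn flag is assumed from those cards, so verify it against your checkpoint and version); float16 is used because the V100 has no bfloat16 support:

```python
import torch
from transformers import AutoModel

path = "OpenGVLab/InternVL2_5-1B"  # example checkpoint

# Same loading pattern as the model-card quickstart, with flash attention off
# and float16 weights (V100 does not support bfloat16 tensor cores).
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    use_flash_attn=False,   # assumed from the InternVL model cards; verify for your version
    trust_remote_code=True,
).eval().cuda()
```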

Was this resolved in the end? When using eager mode, the attention mask dimensions presumably don't match, right? The dataset's collate function would need to be modified.

GUOhm230 avatar Aug 27 '25 01:08 GUOhm230
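
For the mask-shape question above, a minimal sketch of a padding collate is shown below. The field names are hypothetical and InternVL's own dataset code packs sequences differently; this only illustrates producing the 2D padding attention_mask that the eager attention path expects:

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # standard ignore index for loss masking in HF-style training

def pad_collate_fn(batch, pad_token_id=0):
    """Pad variable-length samples to a common length and build the 2D
    attention_mask (1 for real tokens, 0 for padding) used by eager attention."""
    max_len = max(item["input_ids"].size(0) for item in batch)
    input_ids, attention_mask, labels = [], [], []
    for item in batch:
        ids = item["input_ids"]
        pad = max_len - ids.size(0)
        input_ids.append(F.pad(ids, (0, pad), value=pad_token_id))
        attention_mask.append(torch.cat([
            torch.ones(ids.size(0), dtype=torch.long),
            torch.zeros(pad, dtype=torch.long),
        ]))
        labels.append(F.pad(item["labels"], (0, pad), value=IGNORE_INDEX))
    return {
        "input_ids": torch.stack(input_ids),
        "attention_mask": torch.stack(attention_mask),
        "labels": torch.stack(labels),
    }
```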