LLaVA-UHD-Better
How is the attention mask computed?
https://github.com/ParadoxZW/LLaVA-UHD-Better/blob/main/llava_uhd/adapt_llava.py#L136-L138
Since the first token here is for CLS, shouldn't
m[:w * h] = True
be changed to
m[:w * h + 1] = True
so that the mask also covers the last patch token?
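
To make the off-by-one concrete, here is a minimal sketch in plain Python rather than the repository's code. It assumes the token sequence is laid out as [CLS, patch tokens, padding], which I have not confirmed against `adapt_llava.py`; `build_mask`, `max_len`, and `has_cls` are hypothetical names used only for illustration.

```python
def build_mask(w, h, max_len, has_cls):
    """Return a list of booleans marking valid (non-padding) positions.

    Assumed layout (an assumption, not taken from the repo):
    [CLS (optional), w * h patch tokens, padding up to max_len].
    """
    n_valid = w * h + (1 if has_cls else 0)
    return [i < n_valid for i in range(max_len)]


w, h, max_len = 2, 3, 10

# Equivalent to m[:w * h] = True
mask_without_cls = build_mask(w, h, max_len, has_cls=False)
# Equivalent to m[:w * h + 1] = True
mask_with_cls = build_mask(w, h, max_len, has_cls=True)

# With a CLS token at index 0, the last real patch sits at index w * h.
last_patch_index = w * h
print(mask_without_cls[last_patch_index])  # False: last patch is masked out
print(mask_with_cls[last_patch_index])     # True: last patch is kept
```

If the CLS token is dropped before the mask is applied, the original `m[:w * h] = True` would already be correct, so the answer depends on where in the pipeline the mask is used.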