LLaVA-NeXT

Query about the dimension of outputs.attentions

Open sterzhang opened this issue 5 months ago • 7 comments

Does anyone know why the shape of outputs.attentions[0][-1] is [1, 754, 28, 28]?

754 is the total number of tokens: the input tokens plus the output tokens generated so far.

What do the two 28 dimensions represent?
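For anyone trying to reproduce this, here is a minimal sketch of how the nesting behind `outputs.attentions[0][-1]` could be walked to list every shape. It assumes `outputs` comes from a Hugging Face `generate` call with `output_attentions=True` and `return_dict_in_generate=True`, where `attentions` is a tuple over generation steps, each holding a tuple over layers. `FakeTensor`, `describe_attentions`, and the toy shapes are hypothetical stand-ins made up for illustration, not values from LLaVA-NeXT.

```python
from collections import namedtuple

# Hypothetical stand-in for a tensor: only the .shape attribute is used here.
FakeTensor = namedtuple("FakeTensor", ["shape"])


def describe_attentions(attentions):
    """Return a (step, layer, shape) row for every entry of a nested attentions tuple.

    Assumes the outer tuple indexes generation steps and the inner tuple
    indexes layers, so attentions[0][-1] is the last layer at the first step.
    """
    rows = []
    for step, per_layer in enumerate(attentions):
        for layer, tensor in enumerate(per_layer):
            rows.append((step, layer, tuple(tensor.shape)))
    return rows


# Toy data with made-up sizes: 2 generation steps, 3 layers each.
toy = (
    tuple(FakeTensor((1, 4, 5, 5)) for _ in range(3)),
    tuple(FakeTensor((1, 4, 1, 6)) for _ in range(3)),
)

for step, layer, shape in describe_attentions(toy):
    print(f"step={step} layer={layer} shape={shape}")
```

Calling `describe_attentions(outputs.attentions)` on the real output should work the same way, since it only reads `.shape`. Comparing the printed shapes across steps and layers may help show which axis is which.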

sterzhang, Sep 09 '24 11:09