LLaVA-NeXT

Slow-fast features not being used in current code

Open sam-motamed opened this issue 7 months ago • 2 comments

Hi, I noticed that you have commented out the call to encode_multimodals (https://github.com/LLaVA-VL/LLaVA-NeXT/blob/09e5840d5589ad2d6a8656c0a60f21ae134b3309/llava/model/llava_arch.py#L291C32-L291C55). If I understand correctly, using slow-fast features would require calling self.encode_multimodals rather than self.encode_images. Could you clarify this?

sam-motamed · Apr 18 '25 13:04

I also ran into this problem. It looks like self.encode_multimodals is not called. Could you clarify this?

huajinghua · Apr 22 '25 02:04

The maintainers replied on another thread. They don't use slow-fast pooling on the 7B model, only on the 72B.

sam-motamed · Apr 22 '25 04:04