
About efficiency

Open: Ree1s opened this issue 9 months ago • 1 comment

Thanks for sharing this fascinating project! I tested the default demo with and without VidToMe and found the VidToMe version to be slower. Is this expected?

Ree1s, Feb 14 '25 09:02

Yes, this is expected. VidToMe performs two-stage token merging and unmerging around each self-attention module, which adds computational overhead compared with processing each frame separately (without VidToMe).

VidToMe's efficiency gain is relative to direct self-attention extension, which jointly processes the tokens from all frames in the self-attention modules.

lixirui142, Feb 14 '25 13:02
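
The trade-off described in the reply can be illustrated with a rough cost model. This is only a sketch, not code from the VidToMe repository: it counts query-key pairs in self-attention and ignores the merging and unmerging overhead itself. The function names, the frame count, the tokens per frame, and the `keep_ratio` value are all hypothetical choices made for illustration.

```python
def attention_pairs(num_tokens: int) -> int:
    """Query-key pairs in one self-attention call: quadratic in token count."""
    return num_tokens * num_tokens


def per_frame_cost(frames: int, tokens_per_frame: int) -> int:
    """Each frame attends only to itself (the baseline without VidToMe)."""
    return frames * attention_pairs(tokens_per_frame)


def joint_cost(frames: int, tokens_per_frame: int) -> int:
    """All frames attend jointly (direct self-attention extension)."""
    return attention_pairs(frames * tokens_per_frame)


def merged_joint_cost(frames: int, tokens_per_frame: int, keep_ratio: float) -> int:
    """Joint attention after merging tokens across frames.

    keep_ratio is a hypothetical fraction of tokens that survive merging;
    the cost of the merge and unmerge steps is NOT included here.
    """
    kept = int(frames * tokens_per_frame * keep_ratio)
    return attention_pairs(kept)


if __name__ == "__main__":
    frames, tokens = 8, 1024  # illustrative sizes, not the demo's real settings
    print("per-frame:    ", per_frame_cost(frames, tokens))
    print("joint:        ", joint_cost(frames, tokens))
    print("merged joint: ", merged_joint_cost(frames, tokens, keep_ratio=0.5))
```

Under these assumed numbers the merged variant sits between the two baselines: cheaper than joint attention over all frames, but still more expensive than attending within each frame separately, which is consistent with the slowdown reported in the question.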