VidToMe About efficiency

Open · Ree1s opened this issue 9 months ago · 1 comment

Thanks for sharing this fascinating project! I tested the default demo with and without VidToMe and found the VidToMe version to be slower. Is this expected?

Feb 14 '25 09:02 Ree1s

Yes, this is expected. VidToMe performs two-stage token merging and unmerging around each self-attention module, which adds computational overhead compared with processing each frame separately (i.e., without VidToMe).

The efficiency gain of VidToMe is relative to the direct self-attention extension, which jointly processes the tokens from all frames in the self-attention modules.

Feb 14 '25 13:02 lixirui142
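
To make the comparison in the reply above concrete, here is a rough back-of-the-envelope cost model. It is only a sketch under stated assumptions, not the VidToMe implementation: the function names, the `keep_ratio` parameter, and the matching-overhead formula (one similarity matrix between two halves of the tokens) are all hypothetical, and constant factors, projections, and the unmerging step are ignored.

```python
def attention_cost(num_tokens: int, dim: int) -> int:
    """Approximate multiply-adds for one self-attention call:
    the QK^T product plus the attention-weighted sum over V."""
    return 2 * num_tokens * num_tokens * dim


def per_frame_cost(frames: int, tokens_per_frame: int, dim: int) -> int:
    """Each frame attends only to its own tokens (no VidToMe)."""
    return frames * attention_cost(tokens_per_frame, dim)


def joint_cost(frames: int, tokens_per_frame: int, dim: int) -> int:
    """Direct self-attention extension: all frames attend jointly."""
    return attention_cost(frames * tokens_per_frame, dim)


def merged_joint_cost(frames: int, tokens_per_frame: int, dim: int,
                      keep_ratio: float) -> int:
    """Joint attention over merged tokens plus an assumed matching overhead.

    Assumption: merging computes one similarity matrix between two halves
    of the token set; the real overhead depends on the implementation.
    """
    total = frames * tokens_per_frame
    kept = int(total * keep_ratio)
    half = total // 2
    match_overhead = half * (total - half) * dim
    return attention_cost(kept, dim) + match_overhead


if __name__ == "__main__":
    # Illustrative sizes only; not measured from the demo.
    frames, tokens, dim = 4, 1024, 64
    print("per-frame:", per_frame_cost(frames, tokens, dim))
    print("merged   :", merged_joint_cost(frames, tokens, dim, keep_ratio=0.5))
    print("joint    :", joint_cost(frames, tokens, dim))
```

With these illustrative sizes the model orders the three settings as per-frame < merged < joint, which matches the explanation: merging is slower than independent per-frame processing but cheaper than full joint attention. Actual runtimes also depend on memory traffic and kernel efficiency, which this count does not capture.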
