Add ToMe to the techniques used by Speedster
Description
ToMe (Token Merging) is an optimization technique recently released by Meta AI. It accelerates Vision Transformers by merging tokens that carry very similar information, thereby reducing the number of tokens attention has to process.
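To illustrate the core idea, here is a minimal sketch of similarity-based token merging. This is a simplified greedy version for illustration only; Meta's actual algorithm uses bipartite soft matching inside each transformer block, and the function name and NumPy setup below are my own assumptions, not Speedster or ToMe API.

```python
import numpy as np

def merge_most_similar_tokens(tokens: np.ndarray, r: int) -> np.ndarray:
    """Greedily merge r pairs of the most similar tokens by averaging.

    Simplified illustration of the idea behind ToMe; the real method
    (bipartite soft matching, see Meta AI's paper) is more efficient.
    """
    # Cosine similarity between every pair of tokens.
    normed = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a token cannot merge with itself

    merged = tokens.copy()
    alive = np.ones(len(tokens), dtype=bool)
    for _ in range(r):
        # Pick the most similar pair among tokens not yet merged away.
        masked = np.where(np.outer(alive, alive), sim, -np.inf)
        i, j = np.unravel_index(np.argmax(masked), masked.shape)
        merged[i] = (merged[i] + merged[j]) / 2  # average the pair into one token
        alive[j] = False                         # drop the absorbed token
    return merged[alive]

# Example: 6 tokens of dimension 4, merging 2 pairs leaves 4 tokens.
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))
out = merge_most_similar_tokens(x, r=2)
print(out.shape)  # (4, 4)
```

Each merge removes one token, so attention cost shrinks with every transformer block that applies it.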
Our analysis shows that on CPU, ToMe delivers greater acceleration for Vision Transformers than Speedster has achieved so far. On GPU, ToMe improves ViT speed only at high batch sizes.
I think it would be useful to add ToMe to the set of techniques used by Speedster, to make CPU inference for this type of model even more efficient.
Useful materials
- Analysis ToMe vs Speedster notebook
- Analysis ToMe vs Speedster blog
- Meta AI's blog
- Meta AI's paper
- Meta AI's repo