
Consider removing `memory_efficient_fusion` from xformers

fmassa opened this issue on Oct 06 '22 · 5 comments

🐛 Bug

According to @Chillee in https://github.com/pytorch/pytorch/issues/86020#issuecomment-1264119302

Oh, we deleted the cache on AOTAutograd recently, so it doesn't recompile when the shape changes. The idea here is that AOTAutograd is intended to sit behind torchdynamo, which will handle all the caching, so the caching in AOTAutograd is basically unneeded code.

Given that the current behavior is now incorrect (a function compiled for one input shape is silently reused when the shape changes), it might be worth removing the uses of `memory_efficient_fusion` from xformers; a sketch of the failure mode follows this comment.

cc @dianaml0 for thoughts

fmassa · Oct 06 '22 20:10
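For context, a minimal sketch of the failure mode described above, assuming the functorch-era API of the time; the function, shapes, and CUDA device are illustrative and not taken from the xformers code:

```python
import torch
from functorch.compile import memory_efficient_fusion

def bias_gelu(x, bias):
    # simple pointwise pattern of the kind typically fused this way
    return torch.nn.functional.gelu(x + bias)

fused = memory_efficient_fusion(bias_gelu)

x = torch.randn(8, 128, device="cuda", requires_grad=True)
b = torch.randn(128, device="cuda", requires_grad=True)
fused(x, b)   # compiles for shape (8, 128)

x2 = torch.randn(16, 128, device="cuda", requires_grad=True)
fused(x2, b)  # with the AOTAutograd cache removed, this does NOT recompile;
              # the graph traced for the first shape is silently reused
```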

I think I'll probably add the cache back, since it seems there are a few more folks relying on the old behavior than I thought, and many people don't want to take on a torchdynamo dependency in the short term (i.e., the next couple of months).

Chillee · Oct 06 '22 21:10

Thanks @Chillee, do you know when you'll add the cache back?

dianaml0 · Oct 06 '22 21:10

@dianaml0 I'll do it today.

Chillee · Oct 06 '22 21:10

Actually, we plan on moving Dynamo into PyTorch core sooner than expected (this week). When that's done, would it also be possible to replace the call to `memory_efficient_fusion` with `dynamo.optimize`? That should result in the same optimization being performed (although you'll also be able to switch to inductor if desired).

Chillee · Oct 10 '22 06:10
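A minimal sketch of the replacement Chillee suggests above, assuming the `torch._dynamo` module as moved into core; the `"inductor"` backend choice and the example function are assumptions, not from xformers:

```python
import torch
import torch._dynamo as dynamo

def bias_gelu(x, bias):
    return torch.nn.functional.gelu(x + bias)

# dynamo owns the caching, so a shape change triggers a recompile
# rather than silently reusing the first traced graph
fused = dynamo.optimize("inductor")(bias_gelu)

fused(torch.randn(8, 128), torch.randn(128))   # compiles
fused(torch.randn(16, 128), torch.randn(128))  # recompiles for the new shape
```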

@Chillee thanks for looking into this!

I think we should just remove `memory_efficient_fusion` from the codebase then. IIUC, the goal is for dynamo to be called from the top-level repo (and enabled by default in the future), so having it live in library code would be short-lived; a sketch of that top-level usage follows this comment.

fmassa · Oct 10 '22 12:10
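A sketch of the top-level usage fmassa describes, assuming the PyTorch 2.x `torch.compile` entry point to dynamo; the toy model is a hypothetical stand-in for one assembled from library components:

```python
import torch
import torch.nn as nn

# hypothetical stand-in for a model built from xformers components
model = nn.Sequential(nn.Linear(128, 128), nn.GELU())

# compilation is applied once by the end user at the top level,
# not inside library code
model = torch.compile(model)
out = model(torch.randn(8, 128))
```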

Done in https://github.com/facebookresearch/xformers/pull/843

fmassa · Sep 05 '23 09:09