
Support arbitrary width & height in stable diffusion example

Open Suhail opened this issue 1 year ago • 3 comments

It would be nice to be able to use different heights and widths up to 1024x1024.

Suhail avatar Oct 29 '22 13:10 Suhail

If someone is interested in working on this, I will pay you a $10,000 prize for doing it. DM me on Twitter if so: https://twitter.com/suhail

Performance (throughput, memory) should not degrade more than 15%

Deadline: Nov 30, 2022 (whoever is first)

Suhail avatar Oct 29 '22 22:10 Suhail

For anyone who is interested, here is how to do it for basic enablement:

  1. Check dynamic codegen, using a MIN strategy to create a default kernel instance.
  2. You may need to unfuse gemm + permute.

If you know how to do it, it is probably 2 hours of work.

For best performance:

Method 1: Learn how HINT profiling works, and consider a better layout for gemm + permute fusion (more complex).

Method 2: Do not fold weights during compilation; compile multiple instances for different shapes and do bucketing. To avoid wasting memory, modify codegen so that blob memory is passed in from outside (3-4 hours if you know what to do).
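To make the bucketing idea in Method 2 concrete, here is a minimal sketch in plain Python. It does not use any AITemplate API: the bucket list, `pick_bucket`, and `shared_workspace_bytes` are all hypothetical names made up for illustration. It assumes a request is routed to the smallest compiled shape that covers it, and that one externally allocated blob sized for the largest instance can be shared by all instances.

```python
from typing import Dict, List, Optional, Tuple

Shape = Tuple[int, int]  # (height, width) in pixels

# Hypothetical set of shapes, one compiled instance per entry.
BUCKETS: List[Shape] = [
    (512, 512),
    (512, 768),
    (768, 512),
    (768, 768),
    (1024, 1024),
]


def pick_bucket(height: int, width: int,
                buckets: List[Shape] = BUCKETS) -> Optional[Shape]:
    """Return the smallest-area bucket that covers (height, width).

    Returns None when no compiled bucket is large enough.
    """
    fitting = [b for b in buckets if b[0] >= height and b[1] >= width]
    if not fitting:
        return None
    # Smallest area first; the shape itself breaks ties deterministically.
    return min(fitting, key=lambda b: (b[0] * b[1], b))


def shared_workspace_bytes(per_bucket_bytes: Dict[Shape, int]) -> int:
    """Size of a single external blob that every instance can reuse.

    Assumes only one instance runs at a time, so the blob only needs to be
    as large as the most demanding instance, not the sum of all of them.
    """
    return max(per_bucket_bytes.values())
```

For example, a 600x512 request would be served by the (768, 512) instance. How the input is padded or cropped to the bucket shape is left out of the sketch.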

We will release some code that makes static shapes run 20% faster, so that the current PyTorch pipeline runs in around 1 sec at batch size 1. We can see that it is able to run under 1 sec, with maybe an extra 10% - 20% even after the 20% speedup, but our job is focused more on Meta's internal workloads than on optimizing diffusion models, so it may take a while.

antinucleon avatar Oct 30 '22 16:10 antinucleon

FYI: v0.1.1 is released: https://github.com/facebookincubator/AITemplate/pull/74

The new attention is friendlier to dynamic shapes, and the new runtime supports external memory allocators.

antinucleon avatar Nov 09 '22 21:11 antinucleon