AITemplate issues

Support non-square sizes for stable diffusion like 640x384 don't seem to work

From @terrychenism: "the group norm problem size is not supported yet." My diff:

```diff
diff --git a/examples/05_stable_diffusion/compile.py b/examples/05_stable_diffusion/compile.py
index 513df5b..790f3c0 100644
--- a/examples/05_stable_diffusion/compile.py
+++ b/examples/05_stable_diffusion/compile.py
@@ -177,8 +177,8 @@ def...
```

Suhail

[WIP] Sync v0.11

WIP PR, DO NOT USE NOW. Updates:
- New Xformer attention codegen, >20% speed up on Stable Diffusion
- New Xformer dual gemm codegen
- Various new utility codegen
- ...

antinucleon

CLA Signed

module: rocm

Update README.md

HuggingFace -> Hugging Face

eltociear

CLA Signed

Add fp32 support for stablediffusion

2

Want to be able to use Stable Diffusion at different precisions, starting with fp32 as the simplest case: `python scripts/compile.py --dtype=float32 --use-fp16-acc=False`. Currently stuck at https://gist.github.com/benjibc/838476ee6b5ff326eb6a94ef87b31cd2

benjibc

CLA Signed

Include split+cat in fuse_split optimization

4

This change extends _fuse_split_and_strided_op to also optimize split followed by cat (when both are on the same dim). The split op is removed and the input_accessors of the cat op...

erjiang

CLA Signed

Error when running compile_alt.py in stable diffusion example: list index out of range in conv2d

4

Repro steps:

```
docker exec -it bash
cd AITemplate/examples/05_stable_diffusion
pip install accelerate
python3 scripts/compile_alt.py --local-dir tmp/diffusers-pipeline/stabilityai/stable-diffusion-v2/
```

Errors after a while with:

```
Traceback (most recent call last):
  File "scripts/compile_alt.py",...
```

yit-b

IN PROGRESS, add an initial support of CMake and MSVC compiler

11

Summary: in progress. Some unit tests now finish successfully on AWS machines, both Linux and Windows. Use the `AIT_USE_CMAKE_COMPILATION=1` environment flag.

# Linux
* AWS g4dn.xlarge with 24GB...

alexanderguzhva

CLA Signed

fb-exported

Issues running AIT benchmarking tools for BERT on RTX 4080

2

Hello, I'm running the benchmarking tools on BERT. For sequence lengths [1, 2, 4, 8, 64, 128, 384], it worked well. However, when I chose sequence lengths [512, 1024, 4096], it failed even...

chsungen

add attention backend

fsx950223

CLA Signed

module: rocm

upstream gemm and embeddings

fsx950223

CLA Signed

module: rocm
