
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Results 178 AITemplate issues

`float16` is not CPU-friendly and `float32` input is unnecessarily large (if we are to add data marshaling). I usually pass the input as bytes (uint8), then convert to float16 inside...
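The byte-marshaling idea above can be sketched with numpy (this is an illustration of the commenter's suggestion, not AITemplate's actual input API): serialize a `float16` array as raw `uint8` bytes for transport, then reinterpret the bytes as `float16` on the receiving side.

```python
import numpy as np

def to_bytes(x: np.ndarray) -> np.ndarray:
    """View a float16 array as a flat uint8 byte buffer."""
    return x.astype(np.float16).view(np.uint8).reshape(-1)

def from_bytes(buf: np.ndarray, shape) -> np.ndarray:
    """Reinterpret a uint8 byte buffer as float16 with the given shape."""
    return buf.view(np.float16).reshape(shape)

x = np.array([[0.5, -1.25], [3.0, 0.0]], dtype=np.float16)
buf = to_bytes(x)            # 2 bytes per float16 element
y = from_bytes(buf, x.shape)
assert buf.dtype == np.uint8 and buf.size == x.size * 2
assert np.array_equal(x, y)  # the round-trip is lossless
```

Since the bytes are a reinterpretation rather than a numeric conversion, the round-trip is exact, which avoids both the CPU cost of `float16` arithmetic and the size of a `float32` payload.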

Summary: This diff adds an optimization to the sorted graph that skips element-wise int operations that are no-ops, i.e. multiplication or division by 1, or addition or subtraction of 0...
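A toy illustration of this kind of no-op elimination (a sketch of the idea, not the actual AITemplate graph pass): drop element-wise integer operations that cannot change their input.

```python
# Each step is (op_name, constant) applied left-to-right to an int input.
# Steps that are identities -- x * 1, x // 1, x + 0, x - 0 -- are removed.
def simplify(ops):
    noop = {("mul", 1), ("div", 1), ("add", 0), ("sub", 0)}
    return [step for step in ops if step not in noop]

assert simplify([("add", 0), ("mul", 3), ("div", 1)]) == [("mul", 3)]
assert simplify([("sub", 0), ("mul", 1)]) == []
```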

CLA Signed
fb-exported

Hi, when I run `python3 src/benchmark.py` in `05_stable_diffusion`, there is an error: pt output: torch.Size([1, 77, 1024]) [gemm_rcr_bias_add_25.cu] Got cutlass error: Error Internal at: 214 [20:21:02] model_interface.cu:221: Error: [gemm_rcr_bias_add_25.cu] Got cutlass...

When attempting to build the Docker image as per the README: ``` git clone --recursive https://github.com/facebookincubator/AITemplate cd AITemplate ./docker/build.sh cuda ``` The image fails to build with the error below:...

Hi team, Thank you for your nice work! I encountered an error during inference with Stable Diffusion 1.5 from the sample files. ``` [11:48:42] model_container.cu:87: Init AITemplate Runtime with 1 concurrency [11:48:42] model_container.cu:69:...

I have the code below, which concatenates the input tensor before the conv layer; this is the corresponding original PyTorch code: `x = torch.nn.functional.pad(x, pad, mode="constant", value=0)`. In AIT, assume the input...
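For reference, the `torch.nn.functional.pad` call above zero-pads the tensor with a constant; here is a numpy sketch of the same constant padding on a hypothetical 2x3 input (not the issue's actual model). Note that torch's flat pad tuple lists dimensions last-first, while numpy lists per-axis `(before, after)` pairs first-to-last.

```python
import numpy as np

# numpy equivalent of torch.nn.functional.pad(x, (1, 1), mode="constant", value=0),
# which pads the LAST dimension by one element on each side.
x = np.arange(6).reshape(2, 3)
padded = np.pad(x, pad_width=((0, 0), (1, 1)), mode="constant", constant_values=0)
assert padded.shape == (2, 5)
assert (padded[:, 0] == 0).all() and (padded[:, -1] == 0).all()
assert np.array_equal(padded[:, 1:-1], x)  # the interior is unchanged
```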

Summary: `log1p(x)` is more precise than `log(1+x)` when `x` is close to 0. We utilize the CUDA `log1pf` implementation for fp32. For other precision types, the input is first converted to float,...
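The precision claim is easy to demonstrate on the CPU with Python's `math` module (float64 here rather than the PR's fp32, but the effect is the same): for sufficiently small `x`, `1 + x` rounds to exactly `1.0`, so the naive form loses the signal entirely while `log1p` keeps it.

```python
import math

x = 1e-20
assert 1.0 + x == 1.0            # x is absorbed by float64 rounding
assert math.log(1.0 + x) == 0.0  # the naive form collapses to 0
# log1p(x) ~= x - x*x/2 for small x, so the result stays close to x.
assert math.isclose(math.log1p(x), x, rel_tol=1e-12)
```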

CLA Signed
fb-exported

Differential Revision: D54332190

CLA Signed
fb-exported

Hi AIT team: I'm working on compiling a generative video model into AIT. I can successfully compile the model, as you can see here: ``` 2024-02-09 07:20:35,614 INFO max_blob=19546740864 constant_offset=7630531776...

Say I have two AIT-converted models, `model0` on `cuda0` and `model1` on `cuda1`. Even if I use `cudaSetDevice` to load the models properly on each CUDA device,...