AITemplate

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

178 AITemplate issues, sorted by most recently updated

I couldn't find this in the examples or documentation, so I am leaving it as a question: is AITemplate structured so that it can convert and serve...

Hi, I am trying the stable diffusion example at https://github.com/facebookincubator/AITemplate/tree/main/examples/05_stable_diffusion, but I get the following error when compiling the model with `python3 examples/05_stable_diffusion/compile.py --token ACCESS_TOKEN`:

```
File "examples/05_stable_diffusion/compile.py", line 379,...
```

Summary: Currently we don't allow AIT's inputs to be a subset of Python's inputs. This could cause us trouble, e.g. https://www.internalfb.com/phabricator/paste/view/P619446970?lines=37. Here, `add_14` can be deduced at a very...

Labels: CLA Signed, fb-exported

Differential Revision: D43228829

Labels: CLA Signed, fb-exported

Hi all, it would be great to have an fx2ait example for stable diffusion. It would help provide a tutorial on how to use fx2ait for complex pipelines. The part...

Summary: This diff reverts D42977698 (https://github.com/facebookincubator/AITemplate/commit/5173b284ebfef102ad1ab4a46ec2b9604f1f3275)

Labels: CLA Signed, fb-exported

Hey there, I am trying to run the AITemplate stable diffusion examples on a T4 GPU. I have tried with the same package versions as described in the README, using master, and this...

Hello, after model compilation, will gradient computation through input tensors be available at inference time? Typically, inference libraries do not support backpropagation through model parameters because the model parameters...

When we want to customize stable diffusion (changing the height, width, and batch size), we need to recompile the entire SD model. Is there any way that we can work with multiple...

Hello, I ran the BERT example on MI-250x using the command `python3 examples/03_bert/benchmark_ait.py --batch-size 32 --seq-length 512 --encoders-only false`. However, it aborted with the following errors:

```
./tmp/BERT_fast_gelu_32_512/batch_gather_1.cpp:27: int64_t (anonymous namespace)::GetInOffset(const int64_t,...
```