AITemplate
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
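For orientation, here is a minimal sketch of the typical workflow, loosely following the pattern used in the AITemplate docs and examples (the model and tensor names are illustrative): define a graph with the Python frontend, mark inputs and outputs, and call compile_model to generate and build the GPU code.

```python
from aitemplate.compiler import compile_model
from aitemplate.frontend import nn, Tensor
from aitemplate.testing import detect_target

# A toy two-layer MLP expressed with AITemplate's Python frontend.
class ToyModel(nn.Module):
    def __init__(self, hidden=512):
        super().__init__()
        self.fc1 = nn.Linear(hidden, hidden)
        self.fc2 = nn.Linear(hidden, hidden)

    def forward(self, x):
        return self.fc2(self.fc1(x))

model = ToyModel()
model.name_parameter_tensor()  # give the weights stable names for binding constants later

# fp16 input: AITemplate targets FP16 TensorCore/MatrixCore inference
x = Tensor(shape=[8, 512], dtype="float16", name="input0", is_input=True)
y = model(x)
y._attrs["is_output"] = True
y._attrs["name"] = "output0"

# Generates, builds, and loads the CUDA/HIP C++ module for the detected GPU target
module = compile_model(y, detect_target(), "./tmp", "toy_model")
```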
Run steps:
```
# build docker image
./docker/build.sh cuda
# run docker
docker run -it --gpus=all ait:latest bash
# run scripts
cd /AITemplate/examples/05_stable_diffusion
python3 scripts/download_pipeline.py
python3 scripts/compile.py
```
Error log: ...
I use AITemplate for Stable Diffusion inference and compilation succeeds, but the img2img result differs from the PyTorch diffusers img2img result. The two results are broadly similar, but there are differences...
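To pin down how large the discrepancy actually is, one option is to compare the two outputs numerically rather than visually; some difference is expected simply from fp16 TensorCore kernels versus the PyTorch implementation. A rough sketch (file names are hypothetical, and both images are assumed to come from the same prompt, seed, and init image):

```python
import numpy as np
from PIL import Image

# Hypothetical output files from the AITemplate and diffusers img2img pipelines
ait = np.asarray(Image.open("ait_img2img.png"), dtype=np.float32)
ref = np.asarray(Image.open("diffusers_img2img.png"), dtype=np.float32)

diff = np.abs(ait - ref)
print("mean abs diff (0-255 scale):", diff.mean())
print("max abs diff  (0-255 scale):", diff.max())
```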
When will these two features be finished? 1) Quantization: fp8/int8/int4. 2) Sparsity pruning for GEMM. I am very much looking forward to them :)
Any plan to support an attention mask in BERT?
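For context, the mask being asked about is the usual additive key-padding mask applied to the attention scores before softmax. In plain PyTorch (not AITemplate) it looks roughly like this sketch:

```python
import torch

def masked_attention(q, k, v, key_padding_mask):
    # q, k, v: [batch, heads, seq, head_dim]; key_padding_mask: [batch, seq], True = padded
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)     # [batch, heads, seq, seq]
    # Padded key positions get a large negative bias so softmax gives them ~0 weight
    bias = key_padding_mask[:, None, None, :].to(scores.dtype) * -1e4
    probs = torch.softmax(scores + bias, dim=-1)
    return probs @ v

q = k = v = torch.randn(2, 4, 8, 16)
mask = torch.zeros(2, 8, dtype=torch.bool)
mask[:, 6:] = True  # last two tokens of each sequence are padding
out = masked_attention(q, k, v, mask)
```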
When I try to compile a simple convolution network, the compilation process crashes because conv2d.attrs_['op_instance'] is empty for the convolution layer. How can I fix it? The behaviour can be reproduced by this...
Attempting to run the 04_vit benchmarking tool (benchmark_ait.py) hits an error when trying to load an op instance for one of the conv ops.
```
Traceback (most recent call last):
...
```
### Is your feature request related to a problem? Please describe.
I would like to request the implementation of a compressed tiled matrix multiply operator for use in large language...
Hi! It would be great to have an example (like the ones for SD, Detectron2, and ResNet) for the MiDaS depth estimation model.
Hi, I use the diffusion depth2img model, which needs vae.encode. So I want to convert the diffusion vae.encode into an AIT model. However, the torch.nn.functional.pad function is required in...
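One workaround sometimes used when a dedicated pad op is unavailable is to express zero padding as concatenation with zero tensors. The sketch below shows the equivalence in plain PyTorch; whether this maps cleanly onto the AIT graph depends on which ops the frontend exposes for your model:

```python
import torch
import torch.nn.functional as F

# Zero-padding the spatial dims rewritten as concatenation with zero tensors.
x = torch.randn(1, 3, 64, 64)

padded_ref = F.pad(x, (1, 1, 1, 1))  # pad the last two dims (W, then H) by 1 on each side

zeros_w = torch.zeros(1, 3, 64, 1)
y = torch.cat([zeros_w, x, zeros_w], dim=3)       # pad width
zeros_h = torch.zeros(1, 3, 1, y.shape[3])
y = torch.cat([zeros_h, y, zeros_h], dim=2)       # pad height

assert torch.equal(y, padded_ref)
```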