AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Summary: This diff extends D44374161 in the following ways: one criticism with respect to compile_model locking its build dir is that this is kind of unexpected behavior and not part...
This commit will decode the compute capability from `nvidia-smi`, and allow `_detect_cuda` to return an appropriate value for GPUs other than the T4, V100, and A100 (such as the RTX...
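The decoding step described in this commit can be sketched roughly as follows. This is a minimal illustration, not the commit's actual implementation: `parse_compute_caps` and `detect_cuda_arch` are hypothetical names, and the sketch assumes a driver recent enough that `nvidia-smi` supports the `compute_cap` query field.

```python
import subprocess
from typing import List, Optional


def parse_compute_caps(smi_output: str) -> List[str]:
    """Turn lines such as '8.6' into arch strings such as '86'.

    Lines that are not of the form <digits>.<digits> are skipped.
    """
    archs = []
    for line in smi_output.splitlines():
        line = line.strip()
        if not line:
            continue
        major, _, minor = line.partition(".")
        if major.isdigit() and minor.isdigit():
            archs.append(major + minor)
    return archs


def detect_cuda_arch() -> Optional[str]:
    """Return the arch string of the first GPU, or None if it cannot be found.

    Hypothetical helper: assumes nvidia-smi is on PATH and understands
    the compute_cap query field.
    """
    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    archs = parse_compute_caps(out)
    return archs[0] if archs else None
```

Keeping the parsing separate from the subprocess call makes the string handling testable on machines without a GPU.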
Issue: StableDiffusionAITPipeline and StableDiffusionImg2ImgAITPipeline in pipeline_stable_diffusion_ait.py and pipeline_stable_diffusion_img2img_ait.py have workdir hardcoded to "tmp/", which might create problems where the file system locations are either read-only or users have restricted permissions...
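One way to avoid a hardcoded working directory is to accept it as a parameter and fall back to a system temporary directory when the requested location is not writable. The sketch below is an illustration only: `resolve_workdir` and the `AIT_WORKDIR` environment variable are hypothetical names, not part of AITemplate or the pipelines named in the issue.

```python
import os
import tempfile
from typing import Optional


def resolve_workdir(workdir: Optional[str] = None, env_var: str = "AIT_WORKDIR") -> str:
    """Pick a writable working directory.

    Order of preference: the explicit argument, then the (hypothetical)
    environment variable, then a freshly created temporary directory.
    """
    candidate = workdir or os.environ.get(env_var)
    if candidate:
        try:
            os.makedirs(candidate, exist_ok=True)
            if os.access(candidate, os.W_OK):
                return candidate
        except OSError:
            # Read-only file system or insufficient permissions: fall through.
            pass
    return tempfile.mkdtemp(prefix="ait_")
```

A pipeline constructor could then take `workdir=None` and call a helper like this instead of assuming that "tmp/" is writable.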
I have a group of fine-tuned stable diffusion models using the same training process. After compiling one model into AIT format, is it possible to reuse the "selected" AIT graph...
Based on what's present, it seems like it's possible to use AITemplate entirely from native code with no Python runtime. Could you add an example which illustrates how to...
It's really not suitable to force users to always provide nvcc for compilation at runtime. It makes deployments brittle and very complicated, particularly for native code. NVIDIA provides nvrtc...
Reviewed By: frank-wei Differential Revision: D43677477
Adding support for StreamK for GEMM operations (except for group GEMM). As workspace calculations are hard to cap for a particular range of shapes (and max shape calculation doesn't always...
Even if a seed is given to the generator, the result changes slightly each time.
```
def torch_fix_seed(seed=42):
    # Python random
    random.seed(seed)
    # Numpy
    np.random.seed(seed)
    # Pytorch
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic...
```
When I run the examples, an error occurs: ` File "/opt/conda/lib/python3.8/site-packages/aitemplate/backend/cuda/conv2d/conv2d_bias_relu_few_channels.py", line 47, in conv2d_bias_relu_few_channels_config func_attrs["op_instance"] = cfc.extract_config( File "/opt/conda/lib/python3.8/site-packages/aitemplate/backend/cuda/conv2d/common_conv2d_few_channels.py", line 49, in extract_config return common.extract_config( File "/opt/conda/lib/python3.8/site-packages/aitemplate/backend/cuda/conv2d/common.py", line...