TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization i...
# Description This change adds a new quick start notebook for JAX that mirrors the Transformer architecture of the PyTorch guide, so that users familiar with the PyTorch guide can also...
Docs fix
# Description Our documentation build produced a lot of warnings, and it seems that some of them were legitimate. It turned out that half of our PyTorch API was not rendered....
# Description Connects TE/common cuBLASMp bindings to the PyTorch GEMM call. Note: I will likely close and re-submit this PR after cleaning up the git history issues. ## Type of...
# Description This PR: - removes the `nvte_fused_attn_fwd_qkvpacked`, `nvte_fused_attn_bwd_qkvpacked`, `nvte_fused_attn_fwd_kvpacked`, and `nvte_fused_attn_bwd_kvpacked` APIs, leaving `nvte_fused_attn_fwd` and `nvte_fused_attn_bwd` for uniformity and easier maintenance, - improves error messaging in `nvte_get_fused_attn_backend` (ultimately the backend...
# Description Sliding Window Attention with CP for THD format is enabled with A2A communication. Fixes # (issue) ## Type of change - [ ] Documentation change (change only to...
# Description Makes test tolerances stricter for TE/JAX tests in test_layer.py ## Type of change - [ ] Documentation change (change only to the documentation, either a fix or a...
# Description - Adding TopK fusion to JAX for both forward and backward. Fixes # (issue) ## Type of change - [ ] Documentation change (change only to the documentation,...
Hi, I’m running into some compilation issues when trying to compile some of the release branches from source and build a wheel from them. (attached a reproducible example at...
Hello, I am trying to install transformer-engine using `pip install --no-build-isolation transformer_engine[pytorch]` on two different systems, but I always get compilation errors: Personal system (Nvidia 3090, torch 2.6): ``` Building...