DefTruth

Results 39 issues of DefTruth

### System Info version 4.44.1 ### Who can help? @ArthurZucker ### Information - [ ] The official example scripts - [X] My own modified scripts ### Tasks - [ ]...

bug

## Description ## Environment **TensorRT Version**: **NVIDIA GPU**: **NVIDIA Driver Version**: **CUDA Version**: **CUDNN Version**: Operating System: Python Version (if applicable): Tensorflow Version (if applicable): PyTorch Version (if applicable): Baremetal...

## TODO - [ ] swish kernel - [ ] gelu kernel - [ ] RoPE kernel - [x] pack elementwise_add - [x] pack sigmoid - [x] pack relu -...

### Your current environment The output of `python collect_env.py` ```text Your output of `python collect_env.py` here ``` ### 🐛 Describe the bug ```bash Fri Feb 28 11:23:46 2025 +-----------------------------------------------------------------------------------------+ |...

bug

### Your current environment The output of `python collect_env.py` ```text Your output of `python collect_env.py` here ``` - L20 x 3, 72 GPUs - TP=8, PP=3 - vLLM commit: 811a46bf06f872c28147f957b3a9d18d97d1c1ad...

bug

Remove redundant Exp calculations

This PR introduces cache-dit to nunchaku, supporting the Qwen-Image and FLUX.1 series: inference w/ cache-dit + nunchaku 4-bit @lmxyy ## 📚Core Features of CacheDiT - **[🎉Full 🤗Diffusers Support](https://github.com/vipshop/cache-dit/tree/main/docs/User_Guide.md#supported-pipelines)**: Notably, **[cache-dit](https://github.com/vipshop/cache-dit)** now...

How can the quantizer be used after the pipeline is loaded? - Currently ```python # Quantization occurs at load time. pipe = QwenImagePipeline.from_pretrained( ( args.model_path if args.model_path is not None else os.environ.get( "QWEN_IMAGE_DIR", "Qwen/Qwen-Image",...

## 🤖UAA: Ulysses Anything Attention We have implemented **[📚UAA: Ulysses Anything Attention](https://github.com/vipshop/cache-dit/blob/main/docs/User_Guide.md#uaa-ulysses-anything-attention)**: a Ulysses Attention variant that supports **arbitrary sequence length** with ✅**zero padding** and ✅**nearly zero theoretical communication overhead**....
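The snippet above is truncated, so the following is only a rough illustration of the general idea of sharding an arbitrary sequence length across ranks without padding. It is not cache-dit's implementation. The function names `uneven_shard_sizes` and `shard_bounds` are hypothetical, and the policy of giving the first ranks one extra token is an assumption made for this sketch.

```python
def uneven_shard_sizes(seq_len: int, world_size: int) -> list[int]:
    """Split seq_len tokens across world_size ranks with no padding tokens.

    Assumed policy (hypothetical): the first `seq_len % world_size` ranks
    each receive one extra token, so shard sizes differ by at most one.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    base, rem = divmod(seq_len, world_size)
    return [base + (1 if rank < rem else 0) for rank in range(world_size)]


def shard_bounds(seq_len: int, world_size: int) -> list[tuple[int, int]]:
    """Return the half-open [start, end) token range owned by each rank."""
    bounds = []
    start = 0
    for size in uneven_shard_sizes(seq_len, world_size):
        bounds.append((start, start + size))
        start += size
    return bounds


if __name__ == "__main__":
    # A sequence length that is not divisible by the number of ranks.
    print(uneven_shard_sizes(10, 4))
    print(shard_bounds(10, 4))
```

Because the shard sizes sum exactly to the sequence length, no rank holds padding tokens. In a real sequence-parallel setup each rank would also need the other ranks' shard sizes before exchanging tensors of unequal length.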