Driss Guessous

58 issues by Driss Guessous

# Summary This PR adds FlexAttention as a new unified_attention backend for the V1 engine. This requires torch 2.7+ since we fixed a number of dynamic shapes issues that show...

documentation
needs-rebase
ci/build
v1

Stacked PRs:
* __->__#88
--- --- ---
hacked up
```Shell
❯ torchfix auto_deprecate.py
auto_deprecate.py:8:15: TOR101 [*] Use of deprecated function torch.nn.functional.soft_margin_loss
--- /home/drisspg/meta/scripts/misc/auto_deprecate.py
+++ /home/drisspg/meta/scripts/misc/auto_deprecate.py
@@ -3,7 +3,7 @@
 import...
```

CLA Signed

Stacked PRs:
* __->__#1190
--- --- ---
Add mxfp8 path
```Shell
with-proxy CONFIG_FILE="torchtitan/models/llama3/train_configs/llama3_8b.toml" ./run_train.sh --model.print_after_conversion --training.compile --training.steps 50 --model.converters mxfloat8 --float8.recipe_name "mxfp8"
```
## Review highlight
I wish we...

CLA Signed

Stacked PRs: * __->__#1189 --- --- --- Add mxfp8 path

CLA Signed

See https://github.com/pytorch/pytorch/issues/147551#issuecomment-2683700299

Stacked PRs: * #2258 * __->__#2256 * #2253 --- --- --- Fixes: https://github.com/pytorch/ao/issues/2182 Add a way to do power of 2 scaling

CLA Signed
float8
topic: new feature
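
The power-of-2 scaling mentioned in the entry above can be illustrated with a minimal sketch. It assumes a float8 E4M3 target whose largest finite value is 448, and it rounds the scale down to the nearest power of two so that multiplying by it only shifts the exponent and introduces no rounding error. The names `round_down_to_power_of_2` and `fp8_scale` are hypothetical and are not taken from the PR.

```python
import math

# Assumption: float8 E4M3 (finite-only variant) with max finite value 448.
FP8_E4M3_MAX = 448.0


def round_down_to_power_of_2(x: float) -> float:
    """Return the largest power of two that is <= x (x must be positive and finite)."""
    if not math.isfinite(x) or x <= 0.0:
        raise ValueError("x must be a positive finite number")
    # frexp gives x = m * 2**e with 0.5 <= m < 1, so 2**(e - 1) <= x < 2**e.
    _, e = math.frexp(x)
    return math.ldexp(1.0, e - 1)


def fp8_scale(amax: float, power_of_2: bool = True) -> float:
    """Scale that maps a tensor with absolute maximum `amax` into the float8 range."""
    scale = FP8_E4M3_MAX / amax
    # Rounding down (not up) keeps amax * scale within the representable range.
    return round_down_to_power_of_2(scale) if power_of_2 else scale


if __name__ == "__main__":
    print(fp8_scale(3.0, power_of_2=False))
    print(fp8_scale(3.0))
```

Rounding down rather than to the nearest power of two is a deliberate choice in this sketch: a smaller scale can never push the scaled maximum past the float8 limit.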

Stacked PRs: * __->__#2219 --- --- --- Manually specify flags if no arch set

CLA Signed
topic: not user facing

# VLLM Torch.compile Issue Tracker ## Summary This document tracks the existing issue with the way VLLM uses `torch.compile` and tensor subclasses. **TLDR**: VLLM doesn't set up `aotdispatch` correctly, causing subclass...

tracker
inference

# MXFP Inference and Performance Tracking ## Summary This issue tracks performance and E2E integration of MXFP formats (MXFP8, MXFP4, NVFP4) on B200 and other devices. ## Status Overview |...

tracker
mx
performance
topic: performance

# Summary See: https://github.com/pypa/pip/issues/6334

triaged
build