Driss Guessous

Results 39 issues of Driss Guessous

First of all, I want to say thank you for making such an amazing tool! It has greatly improved my workflow, so I can't thank you enough! I am often astonished...

# Summary
We are currently working on integrating an fp8 scaled matmul kernel written using Cutlass into PyTorch. PyTorch has the constraint that it can be linked against the cuda...

feature request

# Summary
https://github.com/pytorch/pytorch/pull/125204 is failing on Windows with:
``` Shell
2024-06-03T10:06:46.7711483Z C:/cb/pytorch_1000000000000/work/aten/src/ATen/../../../third_party/cutlass/include\cutlass/uint128.h(189): error: calling a __host__ function("_udiv128") from a __host__ __device__ function("cutlass::uint128_t::operator / const") is not allowed
2024-06-03T10:06:46.7712785Z
2024-06-03T10:06:46.7713366Z 1...
```

bug
? - Needs Triage

# Summary
Sm90a is needed to enable some features found in Cutlass. For some reason Google has the worst SEO for finding more information about this gencode. This is the...

cla signed

# Summary
For more details see this PyTorch issue: https://github.com/pytorch/pytorch/issues/131257

I was able to reproduce on 74b0761ff7efc7b90d4e5aeb529c1b2a09a7458c with the following script:
``` Python
import torch
import torch.nn.functional as F
from torch.nn.attention...
```

# Summary
* Switch the ordering of the authors
* Add in H100 perf charts

# Summary
Currently we have two "eval" scripts for measuring the performance of LLMs post-quantization: https://github.com/pytorch/ao/blob/main/torchao/_models/llama/eval.py and https://github.com/pytorch/ao/blob/main/scripts/hf_eval.py. The default task we have is wikitext. We should create a "large" eval...

enhancement

# Summary
Autoquant will iterate through a user module and identify all linear dtype + shapes as well as execution time for different quantization routines. This information is baked into...

autoquant
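The autoquant summary above describes collecting the dtype and shape of each linear layer and timing several quantization routines against them. As a rough, library-free sketch of that idea (not torchao's implementation), the snippet below groups layers by a `(dtype, shape)` key, times each candidate once per unique key, and picks the fastest. All names here (`collect_linear_keys`, `time_routines`, `pick_fastest`, `routine_a`, `routine_b`, and the layer names) are hypothetical placeholders.

```python
import time
from collections import defaultdict

def collect_linear_keys(layers):
    """Group layer names by (dtype, shape) so each unique key is timed only once."""
    keys = defaultdict(list)
    for name, dtype, shape in layers:
        keys[(dtype, shape)].append(name)
    return dict(keys)

def time_routines(keys, routines, repeats=3):
    """Record the best wall-clock time of every candidate routine for every key."""
    timings = {}
    for dtype, shape in keys:
        for routine_name, fn in routines.items():
            best = float("inf")
            for _ in range(repeats):
                start = time.perf_counter()
                fn(dtype, shape)
                best = min(best, time.perf_counter() - start)
            timings[((dtype, shape), routine_name)] = best
    return timings

def pick_fastest(keys, timings, routines):
    """Map each (dtype, shape) key to the name of its fastest routine."""
    return {key: min(routines, key=lambda r: timings[(key, r)]) for key in keys}

# Hypothetical layer descriptions: (name, dtype, weight shape).
layers = [
    ("attn.q_proj", "bfloat16", (4096, 4096)),
    ("attn.k_proj", "bfloat16", (4096, 4096)),
    ("mlp.up_proj", "bfloat16", (4096, 11008)),
]

# Placeholder workloads standing in for real quantized matmul benchmarks.
routines = {
    "routine_a": lambda dtype, shape: sum(range(1000)),
    "routine_b": lambda dtype, shape: sum(range(2000)),
}

keys = collect_linear_keys(layers)
timings = time_routines(keys, routines)
choices = pick_fastest(keys, timings, routines)
```

Keying on `(dtype, shape)` rather than on layer name means repeated layers of the same size share one measurement, which is the main reason to gather shapes before timing anything.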

Stacked PRs:
* #709
* #707
* __->__#706

---

### add ability to calculate amax in tiles

ghstack-source-id: 83ccec3ec66f30b9d75146d0fc7b1137ea7574c4
Pull Request resolved: https://github.com/pytorch/ao/pull/682

CLA Signed

# Summary
Currently our CI/CD pipeline uses ruff to format and lint files in the codebase. The files it covers are hardcoded in the list in ruff.toml. We should also add mypy support...

enhancement