Driss Guessous

Results 39 issues of Driss Guessous

# Summary Today whenever a user runs autoquant, the [AutoQuantCache](https://github.com/pytorch/ao/blob/e1039abac7f429a8d7f489d047d9b34d6ac6afe2/torchao/quantization/autoquant.py#L33) gets populated with dtype + information for Linears seen within an arbitrary torch.nn.Module. This cache is not persistent. We should...

autoquant

Stacked PRs: * #709 * __->__#707 * #706 --- --- --- ### add rowwise-scaling to fp8 inference Using this script to exercise: ```Python import torch import copy import torch.nn as...

CLA Signed

Stacked PRs: * __->__#709 * #707 * #706 --- --- --- add test to hf_eval.py

CLA Signed

Stacked PRs: * __->__#38 --- --- --- ### small tweaks

## Current State of OSS FP8 Operators So far, all examples of fp8 ops (compute in fp8) are scaled matmuls that accumulate in a higher precision type. In fact, there...

rfc
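
To illustrate what "scaled matmuls that accumulate in a higher precision type" means, here is a minimal pure-Python sketch. It is an emulation under stated assumptions, not how any fp8 operator is actually implemented: the names `quantize` and `scaled_matmul` are hypothetical, the scaling is per-tensor, and rounding to an integer grid stands in for real fp8 rounding. The only fp8-specific fact used is that 448 is the largest finite value of the float8 e4m3fn format.

```python
# Illustrative emulation of a scaled low-precision matmul.
# Assumptions: per-tensor scaling, and integer rounding as a stand-in
# for fp8 rounding (real fp8 has a non-uniform grid).
FP8_E4M3_MAX = 448.0  # largest finite value of float8 e4m3fn


def quantize(matrix):
    """Scale a matrix so its largest magnitude maps to FP8_E4M3_MAX, then round."""
    amax = max(abs(v) for row in matrix for v in row)
    scale = FP8_E4M3_MAX / amax if amax > 0 else 1.0
    quantized = [[float(round(v * scale)) for v in row] for row in matrix]
    return quantized, scale


def scaled_matmul(a, b):
    """Multiply quantized inputs, accumulate in Python floats, then undo the scales."""
    qa, scale_a = quantize(a)
    qb, scale_b = quantize(b)
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0  # accumulator is a 64-bit float, wider than the inputs
            for p in range(inner):
                acc += qa[i][p] * qb[p][j]
            out[i][j] = acc / (scale_a * scale_b)  # rescale back to the input range
    return out


a = [[1.0, 2.0], [3.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
result = scaled_matmul(a, identity)
```

The point of the sketch is the split of responsibilities: the inputs are stored on a coarse grid, while the running sum and the final rescale happen in a wider type.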

# Summary This can have a large performance impact in real Attention modules. The most common pattern (derived from nano-gpt) ```Python import torch import torch.nn as nn import torch.nn.functional as F...

high priority
triage review
module: cudnn
module: multi-headed-attention

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #137452 * #141185 * #141164 # Summary The follow-up to PR

module: cpu
topic: not user facing
module: inductor
ciflow/inductor
module: flex attention

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #137452 * #141185 * __->__ #141164 # Summary We have another IMA for captured buffers when the sequences are not divisible....

topic: not user facing
module: inductor
ciflow/inductor
ciflow/rocm
module: flex attention

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #137452 * __->__ #141185 * #141164

topic: not user facing
module: inductor
module: dynamo
ciflow/inductor