andrewor14

Results: 16 issues by andrewor14

Following https://github.com/pytorch/pytorch/pull/74128 and https://github.com/pytorch/pytorch/pull/74362, this is part 3 of the effort to reduce duplication in the code that lowers reference quantized patterns to native quantized ops in fbgemm/qnnpack....
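
For context, a "reference quantized pattern" is a dequantize → float op → quantize subgraph that the lowering pass replaces with a single native quantized op. A minimal illustrative sketch of the pattern being matched (the names and the per-tensor scheme here are illustrative, not the actual lowering code):

```python
import torch

# Reference pattern: the float op runs between explicit dequantize/quantize
# nodes, all expressed in ordinary PyTorch.
def reference_quantized_linear(x_q, s_x, z_x, w_q, s_w, z_w, s_out, z_out):
    x = (x_q.float() - z_x) * s_x                 # dequantize activation
    w = (w_q.float() - z_w) * s_w                 # dequantize weight
    out = torch.nn.functional.linear(x, w)        # float op
    q = torch.clamp(torch.round(out / s_out) + z_out, -128, 127)
    return q.to(torch.int8)                       # requantize output

# Lowering rewrites this whole subgraph into one fused native quantized
# linear (fbgemm/qnnpack), avoiding the float round trip at runtime.
```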

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #85068 **Summary:** This commit enables the custom module LSTM path for FX graph mode static quantization. This has the same flow as...

cla signed
release notes: quantization
fx
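
A hedged sketch of the flow this enables: FX graph mode prepare/convert with nn.LSTM routed through its quantizable observed counterpart via the custom module configs. The config details below are assumptions based on the public FX APIs, not copied from the PR:

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.fx.custom_config import (
    ConvertCustomConfig,
    PrepareCustomConfig,
)
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = torch.nn.LSTM(16, 32)

    def forward(self, x):
        return self.lstm(x)

model = Model().eval()
example_inputs = (torch.randn(5, 1, 16),)

# Route nn.LSTM through the quantizable custom module so it is
# statically quantized instead of left in float.
prepare_config = PrepareCustomConfig().set_float_to_observed_mapping(
    torch.nn.LSTM, torch.ao.nn.quantizable.LSTM)
convert_config = ConvertCustomConfig().set_observed_to_quantized_mapping(
    torch.ao.nn.quantizable.LSTM, torch.ao.nn.quantized.LSTM)

prepared = prepare_fx(model, get_default_qconfig_mapping("fbgemm"),
                      example_inputs, prepare_custom_config=prepare_config)
prepared(*example_inputs)  # calibrate with representative data
quantized = convert_fx(prepared, convert_custom_config=convert_config)
```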

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #119496 * __->__ #116092 **Summary:** This commit simplifies the existing decomposition hierarchy of batch norm ops by adding a single, backend agnostic op:...

oncall: distributed
module: cpu
release notes: mps
ciflow/mps
module: inductor
module: dynamo
ciflow/inductor
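
For reference, the computation that all of these batch norm variants bottom out in is standard batch normalization. A minimal sketch of the op semantics (not the actual decomposition code; running stat updates are elided):

```python
import torch

def batch_norm_reference(x, weight, bias, running_mean, running_var,
                         training, eps=1e-5):
    # Normalize with batch statistics in training mode and with
    # running statistics in eval mode.
    if training:
        dims = [0] + list(range(2, x.dim()))
        mean = x.mean(dim=dims)
        var = x.var(dim=dims, unbiased=False)
    else:
        mean, var = running_mean, running_var
    shape = [1, -1] + [1] * (x.dim() - 2)  # broadcast over the channel dim
    x_hat = (x - mean.reshape(shape)) * torch.rsqrt(var.reshape(shape) + eps)
    return x_hat * weight.reshape(shape) + bias.reshape(shape)
```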

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #119496 Summary: This commit switches `aten.batch_norm` to call the new `batch_norm_with_update` and `batch_norm_no_update` ops, instead of the old `_batch_norm_impl_index` op. The new...

release notes: quantization
keep-going
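
A hypothetical sketch of the dispatch described above, with `torch.nn.functional.batch_norm` standing in for the real kernels (the actual op signatures are defined in native_functions.yaml and may differ):

```python
import torch
import torch.nn.functional as F

# Stand-ins for the two new ops named in the summary.
def batch_norm_with_update(x, w, b, running_mean, running_var, momentum, eps):
    # Training path: normalize with batch stats, update running stats in place.
    return F.batch_norm(x, running_mean, running_var, w, b, True, momentum, eps)

def batch_norm_no_update(x, w, b, running_mean, running_var, momentum, eps):
    # Inference path: normalize with running stats, mutate nothing.
    return F.batch_norm(x, running_mean, running_var, w, b, False, momentum, eps)

def batch_norm(x, w, b, running_mean, running_var, training, momentum, eps):
    # The dispatch that replaces the old _batch_norm_impl_index indirection.
    if training:
        return batch_norm_with_update(x, w, b, running_mean, running_var,
                                      momentum, eps)
    return batch_norm_no_update(x, w, b, running_mean, running_var,
                                momentum, eps)
```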

Summary: Add a new quantization scheme that lets users quantize their models using int8 per-token dynamic activation + int4 per-axis grouped weight quantization. Test Plan: ``` tune run quantize...

CLA Signed
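
A hedged sketch of applying this scheme with torchao's `quantize_` API (import paths and argument names vary across torchao versions, so treat the exact names as assumptions):

```python
import torch
from torchao.quantization import quantize_, int8_dynamic_activation_int4_weight

model = torch.nn.Sequential(torch.nn.Linear(256, 256)).eval()

# int8 per-token dynamic activation + int4 grouped weight quantization;
# group_size controls how many weight elements share one int4 scale.
quantize_(model, int8_dynamic_activation_int4_weight(group_size=32))
```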

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #125208 Summary: This commit fixes the pattern matching for conv-bn during QAT fusion where both weight and bias are quantized per channel....

ciflow/trunk
release notes: quantization

**Summary:** This commit adds the option to run quantization-aware training (QAT) during finetuning. QAT refers to "fake quantizing" the weights and activations during training, which performs the following transformation on...

CLA Signed
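
The transformation referenced (and truncated) above is fake quantization: values are rounded onto the integer grid and immediately mapped back to float, so training observes the quantization error while all arithmetic stays in floating point. A minimal sketch:

```python
import torch

def fake_quantize(x, scale, zero_point, qmin=-128, qmax=127):
    # Round onto the integer grid and clamp to the quantized range...
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    # ...then dequantize back to float; in practice gradients flow
    # through the rounding via a straight-through estimator.
    return (q - zero_point) * scale
```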

Summary: Many changes have gone into the full_finetune_distributed recipe but were not reflected in the equivalent qat_distributed recipe. This commit brings the latter up to date, adding features like FSDP2,...

CLA Signed

Currently torchao QAT has two APIs, [tensor subclasses](https://github.com/pytorch/ao/blob/a4221df5e10ff8c33854f964fe6b4e00abfbe542/torchao/quantization/prototype/qat/api.py#L41) and [module swap](https://github.com/pytorch/ao/blob/a4221df5e10ff8c33854f964fe6b4e00abfbe542/torchao/quantization/prototype/qat/_module_swap_api.py#L39). The original plan was to deprecate and eventually remove the old module swap API in favor of the tensor...

rfc

**Summary:** Following https://github.com/pytorch/ao/issues/987, this commit makes module swap the main QAT flow today. We remove all tensor subclass fake quantize injection logic since this is not needed in both the...

CLA Signed
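
A hedged sketch of the module-swap QAT flow (the import path matches the prototype location linked above and may have moved in later torchao releases):

```python
import torch
from torchao.quantization.prototype.qat import Int8DynActInt4WeightQATQuantizer

model = torch.nn.Sequential(torch.nn.Linear(256, 256))

quantizer = Int8DynActInt4WeightQATQuantizer(groupsize=32)
# prepare: swap nn.Linear modules for linears that fake-quantize
# weights and activations during finetuning
model = quantizer.prepare(model)
# ... finetune as usual ...
# convert: swap the fake-quantized linears for actually quantized ones
model = quantizer.convert(model)
```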