pytorch issues

Preserve node's stack trace during retrace

2

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #83050 AOTAutograd retraces the graph module produced by TorchDynamo; this PR preserves the stack trace from the original fx.Node.

SherlockNoMad

cla signed

fx

[mta] APEX style Fused Adam

6

This PR implements an APEX-style FusedAdam in PyTorch. It differs from the APEX version in that it is compatible with `torch.cuda.amp.GradScaler` by setting `_step_supports_amp_scaling` to `True` and unscales...

crcrpar

oncall: distributed

triaged

open source

cla signed

[maskedtensor] first commit, core and creation

1

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #82841 * #82839 * #82837 * __->__ #82836

george-qi

cla signed

[Quant] Vectorize scalar remainder in quantized kernel for normalization

3

## Description This PR improves the performance of the quantized normalization kernel by vectorizing the scalar remainder. In the current implementation [here](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp), the computation is vectorized while the scalar remainder is handled...

Xia-Weiwen

triaged

open source

cla signed

intel priority

intel

[torchgen] Fix selective build error on custom namespace

2

Summary: `SelectiveBuilder` currently hardcodes the `aten` namespace for operators. This no longer works now that operators can have custom namespaces. This PR fixes it. Test Plan: Rely on newly added...

larryliu0820

fb-exported

cla signed

MPS device appears much slower than CPU on M1 Mac Pro

26

### 🐛 Describe the bug Using MPS for BERT inference appears to produce about a 2x slowdown compared to the CPU. Here is code to reproduce the issue: ```python #...

mmisiewicz

triaged

module: mps

Added list clearing codegen to AOTAutograd (hidden behind config.aot_clear_list

3

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #83137 * #83122 * #82874

Chillee

cla signed

[maskedtensor] adding reductions

1

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #82841 * __->__ #82839 * #82837 * #82836

george-qi

cla signed

Propose code owners from Intel

5

This PR proposes a list of CPU-related PyTorch modules that Intel is willing to own or co-own.

jgong5

triaged

open source

cla signed

Fix reshape and empty input convolution issues for channels last memory format

1

* Fixes #78611: reshaping tensors which are channels_last produces unexpected strides. * Fixes an empty-input convolution issue: when the input is empty, e.g. shape (0, 3, 3, 4)...

CaoE

triaged

open source

cla signed

intel priority

intel
