pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Results: 2350 pytorch issues
Sort by recently updated

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #83050 AOTAutograd retraces the graph module produced by TorchDynamo; this PR preserves the stack trace from the original `fx.Node`.

cla signed
fx

This PR implements an APEX-style FusedAdam in PyTorch. It differs from the APEX version in that it is compatible with `torch.cuda.amp.GradScaler` by setting `_step_supports_amp_scaling` to `True` and unscales...

oncall: distributed
triaged
open source
cla signed

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #82841 * #82839 * #82837 * __->__ #82836

cla signed

## Description This PR improves the performance of the quantized kernel for normalize by vectorizing the scalar remainder. In the current implementation [here](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp), the computation is vectorized while the scalar remainder is handled...

triaged
open source
cla signed
intel priority
intel
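
The "vectorized main loop plus remainder" structure mentioned in the quantized normalize entry above can be illustrated with a minimal pure-Python sketch. It is not the ATen kernel: `VEC_WIDTH`, `scale_chunk`, and `scale_all` are hypothetical names, a list slice stands in for a SIMD register, and multiplying by a factor stands in for the real computation.

```python
VEC_WIDTH = 8  # hypothetical lane count; real SIMD widths depend on the CPU


def scale_chunk(chunk, factor):
    """Stand-in for one SIMD operation on up to VEC_WIDTH values."""
    return [x * factor for x in chunk]


def scale_all(values, factor):
    """Process full-width chunks, then the leftover tail as one partial chunk."""
    out = []
    n = len(values)
    main_end = n - (n % VEC_WIDTH)  # last index covered by full chunks

    # Main loop: every chunk here has exactly VEC_WIDTH elements.
    for i in range(0, main_end, VEC_WIDTH):
        out.extend(scale_chunk(values[i:i + VEC_WIDTH], factor))

    # Remainder: fewer than VEC_WIDTH elements are left. Rather than a
    # per-element scalar loop, handle them with one partial-width call.
    if main_end < n:
        out.extend(scale_chunk(values[main_end:], factor))
    return out
```

In a real kernel the partial-width step would typically rely on a masked or length-limited load and store; that detail is assumed here, not taken from the PR.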

Summary: Currently `SelectiveBuilder` hardcodes the namespace `aten` for operators. This no longer works because operators have started to use custom namespaces. This fixes it. Test Plan: Rely on newly added...

fb-exported
cla signed
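
To make the namespace problem in the `SelectiveBuilder` entry above concrete, here is a minimal sketch of reading the namespace from a qualified operator name instead of assuming `aten`. The helper `split_operator_name` and the `custom::my_op` name are hypothetical and are not part of `SelectiveBuilder`; the sketch only assumes the `namespace::name` convention for qualified operator names.

```python
DEFAULT_NAMESPACE = "aten"  # fallback used only when no namespace is given


def split_operator_name(qualified_name):
    """Return (namespace, operator) for a name such as 'aten::add.Tensor'."""
    if "::" in qualified_name:
        namespace, operator = qualified_name.split("::", 1)
        return namespace, operator
    # No explicit namespace: fall back to the default instead of failing.
    return DEFAULT_NAMESPACE, qualified_name


print(split_operator_name("aten::add.Tensor"))
print(split_operator_name("custom::my_op"))
print(split_operator_name("add.Tensor"))
```

The design point is that the namespace comes from the operator name itself, so operators registered outside `aten` keep their own namespace.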

### 🐛 Describe the bug Using MPS for BERT inference appears to produce about a 2x slowdown compared to the CPU. Here is code to reproduce the issue: ```python #...

triaged
module: mps

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #83137 * #83122 * #82874

cla signed

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #82841 * __->__ #82839 * #82837 * #82836

cla signed

This PR proposes a list of CPU-related PyTorch modules that Intel is willing to own or co-own.

triaged
open source
cla signed

* Fixes #78611 Reshaping tensors which are channels_last yields an unexpected stride. * Fixes the empty-input convolution issue: when the input is empty, e.g. shape of (0, 3, 3, 4)...

triaged
open source
cla signed
intel priority
intel
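
For readers unfamiliar with the stride issue in the channels_last entry above, the sketch below computes the strides that a 4-D `(N, C, H, W)` tensor has in contiguous and in channels_last layout, using plain Python rather than PyTorch. The function names are hypothetical, and the sketch shows only the two layouts, not the bug or its fix.

```python
def contiguous_strides(shape):
    """Strides (in elements) for the default NCHW layout."""
    n, c, h, w = shape
    return (c * h * w, h * w, w, 1)


def channels_last_strides(shape):
    """Strides (in elements) when memory is ordered as N, H, W, C."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


shape = (2, 3, 4, 5)
print(contiguous_strides(shape))     # (60, 20, 5, 1)
print(channels_last_strides(shape))  # (60, 1, 15, 3)
```

Because the channel dimension has stride 1 in channels_last, code that assumes the last dimension is the densest can compute unexpected strides after a reshape.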