
Tensors and Dynamic neural networks in Python with strong GPU acceleration

Results: 2,350 pytorch issues

See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by the linter. You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #129762...

open source
better-engineering
topic: not user facing
module: dynamo

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* #129802
* __->__ #129801
* #129800

oncall: distributed

### 🐛 Describe the bug

Hello, I encountered some issues while using `torch.distributed.pipelining`. I tested PiPPy/examples/huggingface/pippy_gpt2.py with the default configuration. Because I'm working on full-model testing, I added a...

oncall: distributed
module: pipelining

This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned audio hash.

open source
ciflow/trunk
topic: not user facing
ciflow/inductor

## Issue description

> RuntimeError: CUDA error: unspecified launch failure

The error occurs on any training script. The occurrence is non-deterministic and can happen at any time during the course of training. All...

module: cudnn
module: cuda
triaged

### Approach: Using the current function declaration

**Constraint:** Q_Heads % KV_Heads == 0

**Major change:** It adds a meaning to the last third dimension.

**Pros:** This approach covers one major...
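The `Q_Heads % KV_Heads == 0` constraint above is the standard grouped-query-attention divisibility rule: each group of query heads shares one KV head. A minimal sketch of that mapping (plain Python, not any actual PyTorch API; the function name is hypothetical):

```python
# Hypothetical helper illustrating the Q_Heads % KV_Heads == 0 constraint:
# query heads are partitioned into equal groups, and every head in a group
# attends against the same KV head.
def kv_head_for_query_head(q_heads: int, kv_heads: int, q_idx: int) -> int:
    if q_heads % kv_heads != 0:
        raise ValueError("Q_Heads must be divisible by KV_Heads")
    group_size = q_heads // kv_heads   # query heads per KV head
    return q_idx // group_size

# With 8 query heads and 2 KV heads: query heads 0-3 map to KV head 0,
# query heads 4-7 map to KV head 1.
```

The divisibility check is what makes the grouping well defined: without it, some KV heads would serve more query heads than others.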

### 🐛 Describe the bug

According to the documentation, `torch.distributed.tensor.parallel.SequenceParallel` should shard on the sequence dimension, i.e. `[B, T, C] -> [B, T//_world_size, C]`, but it seems to be tiling...
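The distinction the report draws can be sketched in plain Python (this is an illustration of the documented shape semantics, not the DTensor implementation): sharding splits the sequence dimension `T` across ranks, whereas tiling would give every rank the full tensor.

```python
# Expected sequence-parallel sharding: each rank holds a slice of dim 1,
# so a [B, T, C] input becomes a [B, T // world_size, C] local shard.
def sharded_shape(shape, world_size):
    B, T, C = shape
    assert T % world_size == 0, "sequence length must divide evenly across ranks"
    return (B, T // world_size, C)

# Tiling (the buggy behavior described above): every rank keeps the full shape.
def tiled_shape(shape, world_size):
    return tuple(shape)

# With B=2, T=8, C=4 and world_size=4, sharding yields (2, 2, 4) per rank,
# while tiling leaves each rank with (2, 8, 4).
```

Comparing the per-rank shapes is usually the quickest way to tell which behavior one is actually getting.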

oncall: distributed
triaged

FSDP2 eager pre-allocates the output buffer for AllGather, and the AllGather simply writes into that buffer. Under compile, however, we use an out-of-place AllGather by default, which means that in Traceable FSDP2...
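The buffer distinction above can be sketched conceptually (plain Python lists standing in for tensors; this is not the FSDP2 or collective implementation): an "into-buffer" all-gather reuses one pre-allocated output, while an out-of-place all-gather allocates a fresh output on every call.

```python
# In-place flavor: the caller pre-allocates `out` once and the gather
# writes each rank's shard into it, avoiding a per-call allocation.
def all_gather_into_buffer(shards, out):
    offset = 0
    for shard in shards:
        out[offset:offset + len(shard)] = shard
        offset += len(shard)
    return out

# Out-of-place flavor: a new output list is allocated on every call.
def all_gather_out_of_place(shards):
    out = []
    for shard in shards:
        out.extend(shard)
    return out

shards = [[1, 2], [3, 4]]
buf = [0, 0, 0, 0]                    # pre-allocated once, reused each iteration
all_gather_into_buffer(shards, buf)   # buf now holds [1, 2, 3, 4]
```

The eager path corresponds to the first flavor; the compiled default corresponds to the second, which is why the two can diverge in memory behavior.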

oncall: distributed
release notes: distributed (fsdp)
module: inductor
ciflow/inductor

Fixes #ISSUE_NUMBER

open source
topic: not user facing

Fixes #95481

Test Plan: Unit tested checkpoint_wrapper.py by instantiating ActivationWrapper and got a TypeError as expected.

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @fegin @XilunWu @wanchaol...

oncall: distributed
ciflow/trunk
module: distributed