examples issues

Example Tensor Parallelism Optimizer Bug

## 📚 Documentation

I believe the optimizer in this example should be declared after the parallelize module call, as in sequence parallelism. Without this, in the latest torch, the example seems...

nrothGIT

Cuda memory usage does not decrease when increasing the number of cuda cards (fsdp_tp_example.py).

According to the implementation of the source code, I did several experiments to study the script running time and cuda memory occupancy. - exp1: nproc_per_node=4, nnodes=1 => cuda=2161~2411MB, runtime=63.04s -...

YangHui90

Error While Running distributed/FSDP/T5_training.py

1

## Context

While running the `distributed/FSDP/T5_training.py` example, I encountered an error when loading the `wikihow` dataset. I would like to know if this is a bug or if there is...

nariaki3551

"RuntimeError: HIP error: invalid device function" when running "mnist" on 7900XTX

## Context

* Pytorch version: 2.6.0+rocm6.2.4
* Operating System and version: Ubuntu 24.04.2 LTS x86_64

## Your Environment

* Installed using source? [yes/no]: no
* Are you planning to deploy...

SuGotLand

Fixed "--help" output in README.md

1

Updated arguments to match what main.py is looking for. Fixed incorrectly listed defaults, removed duplicate "--save-model", copied "--save_model" description from main.py.

asmith512

cla signed

Some issues with the language translation

I tried the language translation examples and found several issues: 1. Python compatibility issues: I changed torch to 2.3.0 and torchtext to 0.18.0; otherwise it will not work on Mac. 2....

guoyuf

Better device handling

2

I can't reproduce the issue of every process allocating memory of GPU 0 (https://github.com/pytorch/examples/issues/969), so maybe the underlying issue has been fixed. Regardless, usage of `torch.cuda.set_device` is [now discouraged](https://pytorch.org/docs/stable/generated/torch.cuda.set_device.html) in...

EIFY

cla signed

Updated regression example in line with other examples

1

The regression example had not been updated recently; this updates the script to look similar to the other examples.

boringbyte

cla signed

multigpu_torchrun.py does not show speed up when training on multi GPUs!

I tried the example provided in [multigpu_torchrun.py](https://github.com/pytorch/examples/blob/main/distributed/ddp-tutorial-series/multigpu_torchrun.py), training on the MNIST dataset with the model replaced by a simple CNN. However, when increasing the number of GPUs in...

MostafaCham

fixed validate fn to match train fn in imagenet main file

3

The transfer to the device was not consistent in the train and validate fn, so I just matched validate with train. It also reduced a couple of inconsistent calls and...

ahmadmughees

cla signed
