
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.

Results 210 examples issues

## 📚 Documentation I believe the optimizer in this example should be declared after the parallelize module call, as in sequence parallelism. Without this, in the latest torch, the example seems...
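The ordering concern above can be illustrated with a minimal single-process sketch. It assumes that the parallelization step replaces the module's parameter objects. The loop below is only a hypothetical stand-in for that step, not the real tensor-parallel API, and `tracks_model` is a helper name introduced here for illustration.

```python
import torch
from torch import nn

model = nn.Linear(4, 2)

# Optimizer created BEFORE the parameters are swapped out.
early_opt = torch.optim.SGD(model.parameters(), lr=0.1)

# Hypothetical stand-in for a transformation that replaces each parameter
# with a new object (assumed behaviour of the parallelization step).
for name, param in list(model.named_parameters()):
    setattr(model, name, nn.Parameter(param.detach().clone()))

# Optimizer created AFTER the swap sees the live parameters.
late_opt = torch.optim.SGD(model.parameters(), lr=0.1)


def tracks_model(opt, module):
    """Return True if the optimizer holds exactly the module's current parameters."""
    live = {id(p) for p in module.parameters()}
    held = {id(p) for group in opt.param_groups for p in group["params"]}
    return held == live


print(tracks_model(early_opt, model))
print(tracks_model(late_opt, model))
```

In this sketch the early optimizer keeps references to the old parameter objects, so its updates would never reach the model's current weights. That is one plausible reason to construct the optimizer only after the model has been transformed.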

Based on the implementation in the source code, I ran several experiments to study the script's running time and CUDA memory usage. - exp1: nproc_per_node=4, nnodes=1 => cuda=2161~2411MB, runtime=63.04s -...

## Context While running the `distributed/FSDP/T5_training.py` example, I encountered an error when loading the `wikihow` dataset. I would like to know if this is a bug or if there is...

## Context * Pytorch version: 2.6.0+rocm6.2.4 * Operating System and version: Ubuntu 24.04.2 LTS x86_64 ## Your Environment * Installed using source? [yes/no]: no * Are you planning to deploy...

Updated arguments to match what main.py is looking for. Fixed incorrectly listed defaults, removed duplicate "--save-model", copied "--save_model" description from main.py.

cla signed

I tried the language translation examples and found several issues: 1. Python compatibility issues: I changed torch to 2.3.0 and torchtext to 0.18.0, otherwise it will not work on Mac. 2....

I can't reproduce the issue of every process allocating memory of GPU 0 (https://github.com/pytorch/examples/issues/969), so maybe the underlying issue has been fixed. Regardless, usage of `torch.cuda.set_device` is [now discouraged](https://pytorch.org/docs/stable/generated/torch.cuda.set_device.html) in...
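One commonly suggested alternative to calling `torch.cuda.set_device` is to restrict each worker process to its own GPU through the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below is a standard-library-only illustration of that idea. The helper name `env_for_local_rank` and its parameters are assumptions made for this example and do not come from the linked PR or from PyTorch.

```python
import os


def env_for_local_rank(local_rank, gpus_per_process=1, base_env=None):
    """Build an environment that exposes only this rank's GPU(s).

    Hypothetical helper: the returned mapping is meant to be passed to a
    child process (for example via subprocess.Popen(env=...)) before CUDA
    is initialized in that process.
    """
    if local_rank < 0 or gpus_per_process < 1:
        raise ValueError("local_rank must be >= 0 and gpus_per_process >= 1")
    env = dict(os.environ if base_env is None else base_env)
    first = local_rank * gpus_per_process
    devices = range(first, first + gpus_per_process)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in devices)
    return env


# Example: the worker with local rank 2 would see only physical GPU 2,
# which then appears inside that process as device index 0.
worker_env = env_for_local_rank(2, base_env={})
print(worker_env["CUDA_VISIBLE_DEVICES"])
```

With this approach each process addresses its GPU as `cuda:0`, so no per-process device selection call is needed. The variable has to be set before the process first touches CUDA.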

cla signed

The regression example had not been updated recently, so I updated the script to look similar to the other examples.

cla signed

I tried the example provided in [multigpu_torchrun.py](https://github.com/pytorch/examples/blob/main/distributed/ddp-tutorial-series/multigpu_torchrun.py), training on the MNIST dataset with the model replaced by a simple CNN. However, when increasing the number of GPUs in...

The transfer to the device was not consistent between the train and validate functions, so I matched validate to train. This also reduced a couple of inconsistent calls and...

cla signed