
doesn't move data to cuda in a Windows VSCode environment

Open andycjw opened this issue 1 year ago • 3 comments

System Info

- `Accelerate` version: 0.18.0
- Platform: Windows-10-10.0.19044-SP0
- Python version: 3.9.12
- Numpy version: 1.22.3
- PyTorch version (GPU?): 2.0.0+cu117 (True)
- `Accelerate` default config:
        - compute_environment: LOCAL_MACHINE
        - distributed_type: NO
        - mixed_precision: fp16
        - use_cpu: False
        - num_processes: 1
        - machine_rank: 0
        - num_machines: 1
        - gpu_ids: all
        - rdzv_backend: static
        - same_network: True
        - main_training_function: main
        - downcast_bf16: no
        - tpu_use_cluster: False
        - tpu_use_sudo: False
        - tpu_env: []

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • [X] My own task or dataset (give details below)

Reproduction

from accelerate import Accelerator
import torch
# Initialize the accelerator
accelerator = Accelerator()

# Check the device being used by Accelerate
print(f'Accelerate device: {accelerator.device}')

# Create some data
x = torch.randn(32, 3, 224, 224)
y = torch.randint(0, 10, (32,))

# Attempt to move the data to the device with prepare()
x, y = accelerator.prepare((x, y))

# Check if the data is on the GPU
print(f'x is on cuda: {x.is_cuda}')
print(f'y is on cuda: {y.is_cuda}')

x = x.to('cuda')
y = y.to('cuda')

# Check if the data is on the GPU
print(f'x is on cuda: {x.is_cuda}')
print(f'y is on cuda: {y.is_cuda}')

Expected behavior

Accelerate device: cuda
x is on cuda: False
y is on cuda: False
x is on cuda: True
y is on cuda: True

This is the output on my machine. Both x and y should print True for is_cuda after accelerator.prepare((x, y)), but they only end up on the GPU after the explicit .to('cuda') calls.

andycjw avatar Apr 27 '23 17:04 andycjw

prepare will only move the model, dataloaders, scheduler, and optimizer; it doesn't work on individual tensor inputs. Please create a dataloader and prepare that, or do x = x.to(accelerator.device)
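
For reference, here is a minimal sketch of the second suggestion, reusing the tensors from the repro above (the printed results assume a CUDA-capable machine):

from accelerate import Accelerator
import torch

accelerator = Accelerator()

x = torch.randn(32, 3, 224, 224)
y = torch.randint(0, 10, (32,))

# Move raw tensors explicitly; prepare() is reserved for models,
# optimizers, dataloaders, and schedulers
x = x.to(accelerator.device)
y = y.to(accelerator.device)

print(f'x is on cuda: {x.is_cuda}')  # True
print(f'y is on cuda: {y.is_cuda}')  # True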

muellerzr avatar Apr 27 '23 17:04 muellerzr

I got a different error when using a DataLoader with num_workers > 0; is this a known issue?

from accelerate import Accelerator
from torch.utils.data import DataLoader, Dataset
from torch import nn
import torch

# Define a dummy dataset
class MyDataset(Dataset):
    def __len__(self):
        return 100

    def __getitem__(self, idx):
        x = torch.randn(3, 224, 224)
        y = torch.randint(0, 10, (1,))
        return x, y

# Initialize the accelerator
accelerator = Accelerator()

# Create a DataLoader with worker processes (num_workers > 0)
dataset = MyDataset()
dataloader = DataLoader(dataset, batch_size=32, 
                        num_workers=1, persistent_workers=True
                        )

# Define your model
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 224 * 224, 10)
)

# Define your optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Define your loss function
loss_fn = nn.CrossEntropyLoss()

model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

# Train the model
for epoch in range(10):
    for x, y in dataloader:
        # Forward pass and loss
        y_pred = model(x)
        loss = loss_fn(y_pred, y.squeeze())

        # Backward pass and parameter update
        optimizer.zero_grad()
        accelerator.backward(loss)
        optimizer.step()

Iterating the dataloader then fails inside PyTorch's DataLoader worker handling (traceback excerpt):

   1144 if len(failed_workers) > 0:
   1145     pids_str = ', '.join(str(w.pid) for w in failed_workers)
-> 1146     raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
   1147 if isinstance(e, queue.Empty):
   1148     return (False, None)

RuntimeError: DataLoader worker (pid(s) 27368) exited unexpectedly

The same dataloader runs fine without accelerate.
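
For what it's worth, a common cause of "DataLoader worker exited unexpectedly" on Windows is a missing entry-point guard: worker processes are spawned rather than forked, and they re-import the script on startup. Whether that is the culprit here is an assumption, but a guarded layout of the same repro would look roughly like this:

from accelerate import Accelerator
from torch.utils.data import DataLoader, Dataset
import torch

class MyDataset(Dataset):
    def __len__(self):
        return 100

    def __getitem__(self, idx):
        return torch.randn(3, 224, 224), torch.randint(0, 10, (1,))

def main():
    accelerator = Accelerator()
    dataloader = DataLoader(MyDataset(), batch_size=32,
                            num_workers=1, persistent_workers=True)
    dataloader = accelerator.prepare(dataloader)
    for x, y in dataloader:
        pass  # training step goes here

# On Windows, spawned workers re-import this module, so everything
# with side effects must live behind the entry-point guard
if __name__ == '__main__':
    main()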

andycjw avatar Apr 27 '23 18:04 andycjw

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar May 28 '23 15:05 github-actions[bot]