accelerate
doesn't move data to CUDA in a Windows VS Code environment
System Info
- `Accelerate` version: 0.18.0
- Platform: Windows-10-10.0.19044-SP0
- Python version: 3.9.12
- Numpy version: 1.22.3
- PyTorch version (GPU?): 2.0.0+cu117 (True)
- `Accelerate` default config:
- compute_environment: LOCAL_MACHINE
- distributed_type: NO
- mixed_precision: fp16
- use_cpu: False
- num_processes: 1
- machine_rank: 0
- num_machines: 1
- gpu_ids: all
- rdzv_backend: static
- same_network: True
- main_training_function: main
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] One of the scripts in the examples/ folder of Accelerate, or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
- [X] My own task or dataset (give details below)
Reproduction
from accelerate import Accelerator
import torch
# Initialize the accelerator
accelerator = Accelerator()
# Check the device being used by Accelerate
print(f'Accelerate device: {accelerator.device}')
# Create some data
x = torch.randn(32, 3, 224, 224)
y = torch.randint(0, 10, (32,))
# Move the data to the device
x, y = accelerator.prepare((x, y))
# Check if the data is on the GPU
print(f'x is on cuda: {x.is_cuda}')
print(f'y is on cuda: {y.is_cuda}')
x = x.to('cuda')
y = y.to('cuda')
# Check if the data is on the GPU
print(f'x is on cuda: {x.is_cuda}')
print(f'y is on cuda: {y.is_cuda}')
Expected behavior
Accelerate device: cuda
x is on cuda: False
y is on cuda: False
x is on cuda: True
y is on cuda: True
This is the output on my machine; both x and y should report True for is_cuda after accelerator.prepare((x, y)).
prepare will only move the model, dataloaders, scheduler, and optimizer; it doesn't work on individual tensors. Please create a dataloader and prepare that, or do x = x.to(accelerator.device).
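A minimal sketch of the suggested fix, reusing the tensors from the reproduction script above: standalone tensors are moved explicitly via accelerator.device, which also keeps the script portable to CPU-only machines.

from accelerate import Accelerator
import torch

accelerator = Accelerator()

x = torch.randn(32, 3, 224, 224)
y = torch.randint(0, 10, (32,))

# prepare() does not touch bare tensors, so move them explicitly to
# whatever device Accelerate selected (cuda here, cpu on a GPU-less box)
x = x.to(accelerator.device)
y = y.to(accelerator.device)

print(f'x is on cuda: {x.is_cuda}')  # True on a CUDA machine
print(f'y is on cuda: {y.is_cuda}')  # True on a CUDA machine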
I get a different error when using a DataLoader with num_workers > 0; is this a known issue?
from accelerate import Accelerator
from torch.utils.data import DataLoader, Dataset
from torch import nn
import torch

# Define a dummy dataset
class MyDataset(Dataset):
    def __len__(self):
        return 100

    def __getitem__(self, idx):
        x = torch.randn(3, 224, 224)
        y = torch.randint(0, 10, (1,))
        return x, y

# Initialize the accelerator
accelerator = Accelerator()

# Create a DataLoader with persistent multiprocessing workers
dataset = MyDataset()
dataloader = DataLoader(dataset, batch_size=32,
                        num_workers=1, persistent_workers=True)

# Define your model
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 224 * 224, 10)
)

# Define your optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Define your loss function
loss_fn = nn.CrossEntropyLoss()

model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

# Train your model
for epoch in range(10):
    for x, y in dataloader:
        # Compute the loss
        y_pred = model(x)
        loss = loss_fn(y_pred, y.squeeze())
1144 if len(failed_workers) > 0:
1145 pids_str = ', '.join(str(w.pid) for w in failed_workers)
-> 1146 raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
1147 if isinstance(e, queue.Empty):
1148 return (False, None)
RuntimeError: DataLoader worker (pid(s) 27368) exited unexpectedly
The same DataLoader runs fine without Accelerate.
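For what it's worth, and as an assumption on my side rather than anything confirmed in this thread: on Windows, DataLoader workers are started with spawn rather than fork, so a script using num_workers > 0 has to guard its entry point with if __name__ == '__main__':, or the workers can exit unexpectedly at startup. A minimal sketch of that guard:

import torch
from torch.utils.data import DataLoader, Dataset
from accelerate import Accelerator

class MyDataset(Dataset):
    def __len__(self):
        return 100

    def __getitem__(self, idx):
        return torch.randn(3, 224, 224), torch.randint(0, 10, (1,))

def main():
    accelerator = Accelerator()
    dataloader = DataLoader(MyDataset(), batch_size=32,
                            num_workers=1, persistent_workers=True)
    dataloader = accelerator.prepare(dataloader)
    for x, y in dataloader:
        pass  # training step goes here

# On Windows, spawn re-imports this module in every worker process;
# without this guard the top-level code runs again in each worker.
if __name__ == '__main__':
    main()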
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.