
Problem with Webdataset

Open dguo98 opened this issue 2 years ago • 3 comments

System Info

- `Accelerate` version: 0.18.0
- Platform: Linux-5.4.0-139-generic-x86_64-with-glibc2.31
- Python version: 3.10.11
- Numpy version: 1.24.3
- PyTorch version (GPU?): 2.0.0+cu117 (True)
- `Accelerate` default config:
        Not found

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • [X] My own task or dataset (give details below)

Reproduction

I tried to use webdataset with accelerate. After accelerate.prepare(), the prompts coming out of the webdataset dataloader are empty, whereas before accelerate.prepare() everything is normal (i.e. the prompts are correct).

How I set up webdataset:

    dataset = (
        wds.WebDataset(url)
        .shuffle(100)
        .decode("pil")
        .to_tuple("jpg;png", "txt")
        .map_tuple(transform, lambda x: x)
    )
    dataloader = torch.utils.data.DataLoader(dataset, num_workers=4, batch_size=16)

I ran the following lines before and after accelerate prepare():

    loader = iter(dataloader)
    image, prompt = next(loader)
    print(prompt)
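The before/after check described above could be made repeatable with a small helper along these lines. This is only a sketch: `first_prompts`, `is_empty`, and `compare_prompts` are hypothetical names that do not come from accelerate or webdataset, and it assumes the loader yields `(image, prompt)` batches as in the snippet above. It does not explain why the prompts are empty; it only records whether they are.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List, Tuple


def first_prompts(loader: Iterable[Tuple[Any, Any]], n_batches: int = 1) -> List[Any]:
    """Collect the prompt part of the first n_batches (image, prompt) batches."""
    return [prompt for _image, prompt in islice(iter(loader), n_batches)]


def is_empty(prompt_batch: Any) -> bool:
    """True if a prompt batch is None, has no items, or holds only empty strings."""
    if prompt_batch is None:
        return True
    if isinstance(prompt_batch, str):
        return prompt_batch == ""
    try:
        items = list(prompt_batch)
    except TypeError:
        # Not iterable (e.g. a number): treat it as non-empty content.
        return False
    return len(items) == 0 or all(isinstance(p, str) and p == "" for p in items)


def compare_prompts(before: Iterable, after: Iterable, n_batches: int = 1) -> Dict[str, Any]:
    """Summarise whether the prompts look empty before and after a change."""
    prompts_before = first_prompts(before, n_batches)
    prompts_after = first_prompts(after, n_batches)
    return {
        "before_empty": all(is_empty(p) for p in prompts_before),
        "after_empty": all(is_empty(p) for p in prompts_after),
        "before": prompts_before,
        "after": prompts_after,
    }


# Intended use (assumption: `dataloader` and `accelerator` exist as in the report):
#   prompts_before = first_prompts(dataloader)
#   dataloader = accelerator.prepare(dataloader)
#   prompts_after = first_prompts(dataloader)
#   print(is_empty(prompts_before[0]), is_empty(prompts_after[0]))
```

Pasting the printed result together with the launch command and GPU count would give maintainers a concrete starting point.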

Expected behavior

The prompt should be the same before and after accelerate.prepare(); instead, it becomes empty afterwards.

dguo98 avatar Apr 29 '23 09:04 dguo98

cc @muellerzr

sgugger avatar May 01 '23 13:05 sgugger

@dguo98 I'll need a full reproducer as I don't have any experience with webdataset to help debug this.

What is url here? What is transform? How are you launching your script? And how many GPUs does your system have?

muellerzr avatar May 01 '23 14:05 muellerzr

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar May 29 '23 15:05 github-actions[bot]

I'm running into this problem too.

yiyihum avatar Sep 19 '23 16:09 yiyihum

@dguo98 How did you solve the problem? Could you kindly share?

Guptajakala avatar Mar 03 '24 07:03 Guptajakala