pytorch-deep-learning

problems with setting up dataloader

Open lazyman001 opened this issue 1 year ago • 4 comments

I set up my dataloader like this:

import os
from torch.utils.data import DataLoader

NUM_WORKERS = os.cpu_count()

train_dataloader = DataLoader(
    dataset=train_data_simple,
    batch_size=BATCH_SIZE,
    shuffle=True,
    num_workers=1
)

and an error is reported whenever num_workers is set to any non-zero value. The error is shown in the picture.

lazyman001 avatar Aug 27 '24 03:08 lazyman001

Hey @lazyman001 ,

Where are you getting this issue?

What's the code you're trying to run?

Have you tried putting all your code into a main() function and then calling it with if __name__ == "__main__": main()?

For example:


This error occurs because you're trying to use multiple worker processes in PyTorch without properly protecting the code that starts those processes. Specifically, you need to make sure the code that spawns workers runs behind an if __name__ == '__main__': guard. This is required by the multiprocessing module on platforms that don't use the fork start method by default (such as Windows, and macOS since Python 3.8).

To fix this, make sure your script looks something like this:

import torch
from torch.utils.data import DataLoader

# Your other imports and code here

def main():
    # Your training or data loading code here
    # Example:
    # dataset = YourDataset()
    # dataloader = DataLoader(dataset, num_workers=4)

    pass

if __name__ == '__main__':
    main()
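
To make that concrete, here's a minimal runnable sketch of the same pattern. It's only an illustration: FashionMNIST from torchvision stands in for your train_data_simple, and BATCH_SIZE = 32 is an arbitrary choice.

import os

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

NUM_WORKERS = os.cpu_count()  # number of subprocesses used for data loading
BATCH_SIZE = 32               # arbitrary batch size for this example

def main():
    # Stand-in dataset -- swap in your own train_data_simple here
    train_data = datasets.FashionMNIST(
        root="data",
        train=True,
        download=True,
        transform=transforms.ToTensor(),
    )

    # Creating and iterating the DataLoader inside main() keeps all the
    # worker start-up code behind the __main__ guard below
    train_dataloader = DataLoader(
        dataset=train_data,
        batch_size=BATCH_SIZE,
        shuffle=True,
        num_workers=NUM_WORKERS,
    )

    images, labels = next(iter(train_dataloader))
    print(images.shape, labels.shape)

if __name__ == "__main__":
    main()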

Someone had a similar issue to this the other day, see: https://github.com/mrdbourke/pytorch-deep-learning/discussions/1059

If you're still having troubles, please post the code you're trying to run and describe where you're running it.

mrdbourke avatar Aug 28 '24 08:08 mrdbourke

This error occurs because you're trying to use multiple worker processes in PyTorch: you created a global constant called NUM_WORKERS, but in your DataLoader call you hard-coded num_workers to 1 instead of using it.

To fix this, make sure your script looks something like this:

import torch
import os  # needed for os.cpu_count() below
from torch.utils.data import DataLoader

# Your other imports and code here


def main():
    # Your training or data loading code here
    # Example:
    # dataset = YourDataset()
    # NUM_WORKERS = os.cpu_count()
    # dataloader = DataLoader(dataset, num_workers=NUM_WORKERS)

    pass

if __name__ == '__main__':
    main()

heisdenverr avatar Oct 14 '24 16:10 heisdenverr

Getting the same issue on Mac M1, I suspect it's because of Apple silicon. @lazyman001 are you using a MacBook? When I take away the num_workers parameter it works.

BATCH_SIZE = 1
train_dataloader = DataLoader(
    dataset=train_data,
    batch_size=BATCH_SIZE,
    num_workers=4,
    shuffle=True
)
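
For reference, here's a minimal sketch of the working variant described above (illustration only): num_workers is simply left out, so it falls back to its default of 0 and batches are loaded in the main process. Alternatively, num_workers > 0 should also work if the DataLoader is created and consumed inside the if __name__ == '__main__': guard shown earlier.

from torch.utils.data import DataLoader

BATCH_SIZE = 1

train_dataloader = DataLoader(
    dataset=train_data,   # same train_data as in the snippet above
    batch_size=BATCH_SIZE,
    shuffle=True,
    # num_workers omitted -> defaults to 0, so batches are loaded in the
    # main process and no worker subprocesses are spawned
)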

benkissi avatar Sep 08 '25 07:09 benkissi