ValueError: batch_size should be a positive integer value, but got batch_size=0

Open michnaugh1 opened this issue 7 months ago • 4 comments

Hi @amritamaz and team that worked on Queen. Such exciting technology. Thank you very much for your efforts.

I am trying to run train.py on custom data and I receive the following error:

Traceback (most recent call last):
  File "/home/location/queen/train.py", line 1193, in <module>
    training(lp_args, op, pp_args, qp_args, args.test_iterations, args.save_iterations,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/location/queen/train.py", line 99, in training
    train_loader = iter(torch.utils.data.DataLoader(train_image_dataset, batch_size=train_image_dataset.n_cams,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/location/miniconda3/envs/queen/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 382, in __init__
    batch_sampler = BatchSampler(sampler, batch_size, drop_last)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/location/miniconda3/envs/queen/lib/python3.12/site-packages/torch/utils/data/sampler.py", line 323, in __init__
    raise ValueError(
ValueError: batch_size should be a positive integer value, but got batch_size=0

I'm not sure why I'm getting this error. I was able to build with CUDA 12.4 on Pop!_OS 22.04 with an RTX 4090.

Has anyone else encountered this? Are there any fixes I should consider?

Thanks so much!

michnaugh1, Jun 13 '25 00:06

Could this be related to #6?

joel-evercoast, Jun 13 '25 02:06

I have seen this when the dataset_reader does not successfully load the training images (you can verify this by checking len(train_image_dataset) before line 99). Can you compare your dataset organization against the updated README (which fixes #6) and check whether your train_image_dataset has been populated correctly?

amritamaz, Jun 13 '25 04:06
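
The check suggested above can be sketched as a small guard placed before the DataLoader is built. This is only an illustration, not code from the repository: `require_populated` is a hypothetical helper name, and the only things taken from the thread are `train_image_dataset`, its `n_cams` attribute, and the fact that `n_cams` is passed as `batch_size`.

```python
def require_populated(dataset, name="train_image_dataset"):
    """Fail early with a readable message if the dataset is empty.

    Hypothetical helper: it only assumes `dataset` supports len() and may
    expose an `n_cams` attribute, as the traceback in this issue suggests.
    """
    n_items = len(dataset)
    n_cams = getattr(dataset, "n_cams", None)  # None if the attribute is absent
    if n_items == 0 or n_cams == 0:
        raise RuntimeError(
            f"{name} looks empty (len={n_items}, n_cams={n_cams}); "
            "check the dataset directory layout against the README."
        )
    return n_items
```

Calling `require_populated(train_image_dataset)` just before line 99 of train.py would turn the opaque `batch_size=0` error into a message that points at the dataset layout.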

Thanks so much. I did have an issue with my directory structure: I did not have images directories within each of the cameras. After running train.py and the framer iteration completes, I now get a CUDA out-of-memory error. I'm running this on a 4090 card. Is there a way to reduce memory usage so that it runs on this card without the CUDA error?

michnaugh1, Jun 17 '25 15:06
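
For anyone hitting the same layout problem, a quick pre-flight check along these lines may help. It is a rough sketch under stated assumptions: every immediate subdirectory of the dataset root is treated as a camera, and each camera is expected to contain a subdirectory named `images`, as described in the comment above. The function name and the `subdir` parameter are hypothetical; the README remains the authority on the exact layout.

```python
from pathlib import Path


def cameras_missing_images(root, subdir="images"):
    """Return the names of camera directories that lack `subdir`.

    Assumption (not confirmed by the repository): each immediate
    subdirectory of `root` is one camera.
    """
    root = Path(root)
    missing = []
    for cam_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if not (cam_dir / subdir).is_dir():
            missing.append(cam_dir.name)
    return missing
```

An empty list means every camera directory has the expected subdirectory; any names returned are the cameras to fix before running train.py.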

Hi @michnaugh1, can you paste the full error trace? Where exactly do you get the error? I would expect the code to run on a 4090.

amritamaz, Jul 08 '25 18:07