Training is very slow on a single 3060 Ti: can minumun_number_of_images=3 be set higher?

Open huangpan2507 opened this issue 4 years ago • 4 comments

Training is very slow. I have been training for almost 4 days and it has only reached iteration 103999. Can I increase the minumun_number_of_images parameter? If I change it, do I need to modify code in many other places?

huangpan2507 avatar Dec 18 '21 08:12 huangpan2507

I changed minumun_number_of_images from 3 to 6, but it is still too slow: training will take about 6 days.

huangpan2507 avatar Dec 20 '21 02:12 huangpan2507

Unfortunately, the large-scale ReID benchmark has more than 100,000 training images, so training takes a long time. We recommend running a cross-domain re-ID experiment instead as a quick way to verify your setup.

In my case, one experiment took about 2 days on a single 1080 Ti. For faster results, use a GPU with more memory. Multiple GPUs would also help, but this code does not yet support them.

bismex avatar Dec 21 '21 02:12 bismex

I used one 3060 with 12 GB of memory. I ran the command python3 ./tools/train_net.py --config-file ./configs/Sample/DG-mobilenet.yml and it trained from scratch, which took me almost 6 days.

huangpan2507 avatar Dec 21 '21 07:12 huangpan2507

I'm not sure where the bottleneck is. If you want results faster, please stop training early or experiment on a smaller benchmark.
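For reference, the figures reported earlier in this thread (almost 4 days for 103999 iterations) work out to roughly 3.3 seconds per iteration. One way to narrow down a bottleneck like this is to time how long the loop waits for data versus how long each training step takes. The sketch below is only an illustration and is not part of this repository: `profile_loop`, `batches`, and `step_fn` are hypothetical names standing in for your own data loader and training step.

```python
import time


def profile_loop(batches, step_fn, max_iters=50):
    """Split wall-clock time into 'waiting for data' and 'running the step'.

    batches: any iterable that yields batches (hypothetical stand-in for a data loader)
    step_fn: callable that runs one training step on a batch (hypothetical)
    """
    data_s = 0.0
    step_s = 0.0
    iters = 0
    it = iter(batches)
    while iters < max_iters:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is data loading / preprocessing
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)
        # Assumption: on a GPU, kernels run asynchronously, so step_fn should
        # synchronize (e.g. torch.cuda.synchronize()) before it returns,
        # otherwise the step time measured here will be too small.
        t2 = time.perf_counter()
        data_s += t1 - t0
        step_s += t2 - t1
        iters += 1
    return {"iters": iters, "data_s": data_s, "step_s": step_s}


if __name__ == "__main__":
    # Toy demo: a deliberately slow "loader" and a near-instant "step".
    def slow_loader(n):
        for i in range(n):
            time.sleep(0.01)
            yield i

    stats = profile_loop(slow_loader(20), lambda batch: None)
    print(stats)
```

If `data_s` dominates, the input pipeline (disk reads, augmentation, number of loader workers) is the likely limit and a faster GPU will not help much. If `step_s` dominates, the model computation itself is the limit.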

bismex avatar Dec 21 '21 08:12 bismex