
What is the relationship between samples_per_gpu and workers_per_gpu and sample_ratio?

joeyslv opened this issue 2 years ago · 5 comments

I can only use the default

sample_ratio=[1, 4]
samples_per_gpu=4
workers_per_gpu=4

But when I try to increase the batch size a bit by setting samples_per_gpu=8, the program fails to run with a len() error. Can you explain the relationship between these three settings, and how labeled and unlabeled data are sampled in this project? Thank you very much.

joeyslv avatar May 10 '23 06:05 joeyslv

There are three distinct concepts to understand:

  1. sample_ratio=[1, 4] is the ratio of labeled to unlabeled samples within a single GPU. For instance, sample_ratio=[1, 4] means there are 1 labeled and 4 unlabeled samples on each GPU.
  2. samples_per_gpu=5 is the total number of samples per GPU, labeled and unlabeled combined. In fact, sum(sample_ratio) == samples_per_gpu.
  3. workers_per_gpu=5 sets the number of worker processes used to load the data. The optimal number depends on your server setup. By default, we set workers_per_gpu equal to samples_per_gpu, but you can reduce this value if your server has limited CPU resources, down to 1 or 0 if necessary.

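To make the relationship concrete, here is a minimal sketch (hypothetical values, not copied from the repo's config) of the invariant described above:

```python
# Illustrative config fragment (hypothetical values): the key constraint is
# that samples_per_gpu must equal sum(sample_ratio).

sample_ratio = [1, 4]    # 1 labeled + 4 unlabeled images per GPU
samples_per_gpu = 5      # total images per GPU = sum(sample_ratio)
workers_per_gpu = 5      # data-loading workers; reduce on CPU-limited servers

assert samples_per_gpu == sum(sample_ratio), (
    "samples_per_gpu must equal sum(sample_ratio); "
    "e.g. a batch of 8 needs a ratio like [2, 6], not [1, 4]"
)
```

This would also explain the len() error with samples_per_gpu=8: with sample_ratio=[1, 4] the sampler only produces 5 samples per GPU. Presumably, to use a batch of 8 you would scale the ratio as well, e.g. sample_ratio=[2, 6] to keep roughly the same labeled/unlabeled mix.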
https://github.com/Adamdad/ConsistentTeacher/blob/1fa64775d93976d9b4ceffa4a2ee7d10a5c50c29/configs/consistent-teacher/consistent_teacher_r50_fpn_coco_180k_10p.py#L256-L296

Adamdad avatar May 10 '23 07:05 Adamdad

Hello, I have been reproducing this paper recently and ran into the issue you mentioned in another post: loss = 0. I trained on the COCO dataset; before iteration 10000 both the unsup loss and the total loss were 0, after 10000 the loss was normal for a while, but it soon became 0 again and unsup_gmm_thr also dropped to 0. Could we discuss this issue with you?

X-KL avatar Oct 07 '24 08:10 X-KL

> (quoting X-KL's comment above)

Have you solved this problem? I am running into it as well.

huangnana1 avatar Oct 09 '24 06:10 huangnana1

Hello @huangnana1 and @X-KL , how many GPUs are you using and did you change any values in the configs?

Adamdad avatar Oct 09 '24 06:10 Adamdad

Hello, thanks for your reply. Following your previous reply in other issues, I have solved the problem. Thank you very much.


X-KL avatar Oct 11 '24 01:10 X-KL