ContrastiveSeg

Extremely slow validation during training + CPU overload

Open davidceka opened this issue 2 years ago • 2 comments

Hi! We're testing your code on our dataset. The training iterations run smoothly, but the validation passes in between are extremely slow (4000 images in about 6 hours). In addition, every time the model starts validating, all the CPU cores (we're testing on a 48-core CPU) go to 100%, even though I set the number of workers to 8 in the JSON file. Do you have any idea why, or can I provide any information that may help address the issue?

davidceka • Jul 17 '22 16:07

Hi @davidceka, it does not make sense to me that validation would take so much time. I believe this issue is independent of our algorithm, so I suggest running a simple model like DeepLab/HRNet to see whether it works well. Alternatively, please check whether there are similar issues in the original OpenSeg repo.

tfzhou • Jul 18 '22 16:07

@tfzhou Thanks for answering! We're also using another model that implements DeepLab and ResNet, and we get good results with it, so I don't really know what's going on here. I also reached out to you by email with some additional information (I'd be very grateful if you could check it). We thought the issue might be the size of the images (which were quite large), but we resized all of them to 1024x512 and got the same outcome.

davidceka • Jul 18 '22 19:07
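
One way to narrow down a slowdown like the one described in this thread is to time data loading and the per-batch validation step separately, so it becomes clear which side is responsible. The following is only a minimal sketch using the Python standard library. `profile_loop`, `loader`, and `step` are hypothetical names for illustration and are not part of ContrastiveSeg or OpenSeg; `loader` stands for any iterable of batches (for example a validation data loader) and `step` for whatever is run on each batch.

```python
import time
from typing import Any, Callable, Dict, Iterable


def profile_loop(
    loader: Iterable[Any],
    step: Callable[[Any], Any],
    max_batches: int = 20,
) -> Dict[str, float]:
    """Time batch fetching and batch processing separately.

    Hypothetical helper: `loader` is any iterable of batches and
    `step` is the function applied to each batch.
    """
    load_s = 0.0   # total seconds spent waiting for the next batch
    step_s = 0.0   # total seconds spent inside `step`
    batches = 0
    it = iter(loader)
    while batches < max_batches:
        t0 = time.perf_counter()
        try:
            batch = next(it)
        except StopIteration:
            break  # loader exhausted before max_batches was reached
        t1 = time.perf_counter()
        step(batch)
        t2 = time.perf_counter()
        load_s += t1 - t0
        step_s += t2 - t1
        batches += 1
    return {"batches": batches, "load_s": load_s, "step_s": step_s}


# Example with stand-in data: a list plays the role of the loader.
stats = profile_loop(range(5), lambda batch: batch * 2, max_batches=3)
print(stats["batches"], stats["load_s"], stats["step_s"])
```

If `load_s` dominates, the bottleneck is likely in reading or preprocessing the images; if `step_s` dominates, it is in the forward pass or in metric computation. When `step` runs on a GPU, the step would need to wait for the device to finish (for example by synchronizing inside `step`), otherwise the measured step time can be misleadingly small.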