
Hi, when I train the network, the used GPU memory keeps going up?

Open junleiz opened this issue 7 years ago • 4 comments

Hi, thanks for your nice code. It is really a beautiful project. When I train on the refcoco dataset, the GPU memory in use keeps going up. Initially, about 4.8GB is used. After about 1000 iterations, usage has grown to 9.2GB. I read the code carefully, and it looks like you feed the network one image per batch (I am not sure). If the batch size is constant, why does GPU memory usage keep growing? My GPU is a GTX 1080 Ti with 11GB. Is this your final code? And have you ever run into this problem? Thank you!

junleiz avatar Jun 29 '18 17:06 junleiz

Hi, the problem is that the number of bounding boxes and referring expressions (sentences) varies from image to image. You can try feeding only one sentence per iteration, which helps reduce GPU memory usage but slightly hurts performance. Note that the batch size in our code is defined over sentences rather than images.

yuleiniu avatar Jun 29 '18 20:06 yuleiniu
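To illustrate the suggestion above, here is a minimal sketch of capping the number of sentences fed per iteration. It is not the repository's data loader: `iter_sentence_chunks`, `max_sentences`, and the toy data are hypothetical names made up for illustration, and the real code would also need to carry the bounding boxes and features for each image.

```python
from typing import Dict, Iterator, List, Tuple


def iter_sentence_chunks(
    image_to_sentences: Dict[str, List[str]],
    max_sentences: int = 1,
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (image_id, sentences) pairs with at most `max_sentences` each.

    Splitting an image's sentences into fixed-size chunks keeps the
    per-iteration workload bounded even when some images have many
    referring expressions.
    """
    if max_sentences < 1:
        raise ValueError("max_sentences must be at least 1")
    for image_id, sentences in image_to_sentences.items():
        for start in range(0, len(sentences), max_sentences):
            yield image_id, sentences[start:start + max_sentences]


# Hypothetical toy data: one image with three sentences, one with a single sentence.
toy_refs = {
    "img_a": ["the man on the left", "man in red shirt", "guy holding a bat"],
    "img_b": ["the dog"],
}

# One sentence per iteration, as suggested in the reply above.
chunks = list(iter_sentence_chunks(toy_refs, max_sentences=1))
for image_id, sentences in chunks:
    print(image_id, sentences)
```

With `max_sentences=1` every iteration sees exactly one sentence, so the sentence-side workload no longer depends on how many expressions an image has. The number of bounding boxes per image can still vary, so memory use may not become fully constant.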

Thank you, I fixed it by using two 1080 Ti GPUs.

junleiz avatar Jun 30 '18 12:06 junleiz

Hi, I have run into a problem when training in the unsupervised setting. Everything is fine in the supervised setting, but in the unsupervised setting the result goes wrong: [image] I trained the model following your command on the refcoco dataset. The printed [Nan ...] is the value of scores_val. Could you please tell me how to fix it?

junleiz avatar Jul 02 '18 14:07 junleiz

I think the problem might be computing log(0) in the unsupervised setting. I have modified vc_model.py to prevent this.

yuleiniu avatar Jul 04 '18 20:07 yuleiniu
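For readers hitting the same NaN, a common way to guard against log(0) is to clamp the argument to a small positive value before taking the logarithm. The sketch below shows that general technique in plain Python. It is an assumption about the kind of fix involved, not the actual change made to vc_model.py, and `safe_log` and `EPS` are hypothetical names.

```python
import math

EPS = 1e-8  # assumed lower bound; the right value depends on the model's dtype


def safe_log(p: float, eps: float = EPS) -> float:
    """Return log(p), clamping p to at least `eps` so the result stays finite."""
    return math.log(max(p, eps))


# In array libraries log(0) typically evaluates to -inf, and multiplying
# that by a zero weight gives NaN, which then spreads through the scores.
nan_term = 0.0 * float("-inf")
print(math.isnan(nan_term))

# With clamping, a zero probability gives a large negative but finite value.
print(math.isfinite(safe_log(0.0)))
```

Clamping slightly biases the loss for probabilities below `eps`, which is usually an acceptable trade for keeping the scores finite.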