
Continuously increasing RAM with Pre-training

Open abhisheksgumadi opened this issue 1 year ago • 12 comments

Dear Team,

I am using the pre-training script to pre-train BLIP on a custom dataset (containing around 1M image/text pairs).

I see that the machine's RAM utilization increases continuously until it reaches 100%. The machine has 120 GB of RAM!

Any idea where the problem could be?


abhisheksgumadi avatar Jul 05 '22 13:07 abhisheksgumadi

Do you have custom code which could have a memory leak?

woctezuma avatar Jul 05 '22 14:07 woctezuma

We have a custom dataloader that loads images and text from a Parquet file.

abhisheksgumadi avatar Jul 05 '22 14:07 abhisheksgumadi

We have 1 million images stored on disk and we have prepared the JSON file as described in the GitHub README. Our dataloader loads the JSON file into memory in the __init__ method, and the __getitem__ method then loads the image from the corresponding path given in the JSON file and returns it together with the text.

Not sure why the RAM utilization is so high. Any ideas? Thanks.

abhisheksgumadi avatar Jul 06 '22 20:07 abhisheksgumadi

Hi, it could be related to the dataloader.

LiJunnan1992 avatar Jul 12 '22 08:07 LiJunnan1992

We ended up using the pretrain_dataset.py file and formatted the data as a JSON file exactly as described in the README. Even then, we see the RAM utilization go to 100%. So now we have just formatted the dataset as required, with no changes to the code; we don't even have our own custom code.

abhisheksgumadi avatar Jul 12 '22 08:07 abhisheksgumadi

We are happy to follow any other debugging steps to get this working. Thanks.

abhisheksgumadi avatar Jul 12 '22 12:07 abhisheksgumadi
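
One debugging step that could narrow this down is to check whether Python-level allocations in the main process grow while the dataset is being read. The sketch below uses only the standard library's `tracemalloc`. `ToyDataset` and `traced_growth` are hypothetical names, not part of BLIP, and the toy dataset leaks on purpose so that the growth is visible. It assumes the real dataset would be exercised in the same process (for example with `num_workers=0`), since `tracemalloc` only sees allocations made by Python in the current process and cannot see memory held by DataLoader worker processes.

```python
import tracemalloc


class ToyDataset:
    """Hypothetical stand-in for the real dataset; it leaks on purpose."""

    def __init__(self, n):
        self.n = n
        self._cache = []  # deliberate leak: grows on every access

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        item = ("caption %d" % idx) * 50
        self._cache.append(item)  # keeps every returned item alive
        return item


def traced_growth(dataset, steps):
    """Read `steps` items and report Python-level memory gained meanwhile.

    Returns (total_bytes_gained, top_three_source_lines_by_growth).
    """
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for i in range(steps):
        dataset[i % len(dataset)]
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    stats = after.compare_to(before, "lineno")
    return sum(s.size_diff for s in stats), stats[:3]


if __name__ == "__main__":
    growth, top = traced_growth(ToyDataset(100), 2000)
    print("bytes gained:", growth)
    for stat in top:
        print(stat)
```

If the total stays roughly flat for the real dataset, a plain Python leak in `__getitem__` becomes less likely and the growth is more likely to come from somewhere else, such as the worker processes.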

Was wondering if there has been any update on this. We ran pretrain.py and saw the same issue: RAM usage increases while the JSON files are being read, and at some point RAM is exhausted. For pretraining, which Python version did you use and how much RAM did the machine have?

asgsaeid avatar Nov 30 '22 21:11 asgsaeid

@abhisheksgumadi @asgsaeid You may want to try out our new library which supports BLIP and see if the issue still remains: https://github.com/salesforce/LAVIS

LiJunnan1992 avatar Dec 01 '22 00:12 LiJunnan1992

Thanks, will take a look

abhisheksgumadi avatar Dec 01 '22 00:12 abhisheksgumadi

Hope this helps: https://ppwwyyxx.com/blog/2022/Demystify-RAM-Usage-in-Multiprocess-DataLoader/

dyashuni avatar Jan 11 '23 04:01 dyashuni
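
A minimal sketch of the kind of workaround that post describes, as far as I understand it: a large Python list of dicts consists of millions of small objects, and each access from a forked DataLoader worker updates reference counts, which touches memory pages and can make every worker gradually hold its own copy. Packing the annotations into one contiguous buffer avoids most of that. `SerializedList` is a hypothetical helper, not part of BLIP or PyTorch, and the `image`/`caption` keys in the usage note are an assumption about the annotation format. The post itself works with numpy/torch buffers; this version uses only the standard library.

```python
import pickle
from array import array


class SerializedList:
    """Read-only list stored as one bytes buffer plus an offset table.

    Hypothetical helper: each item is pickled once up front and
    unpickled on access, so the container holds two large objects
    instead of millions of small ones.
    """

    def __init__(self, items):
        blobs = [pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL) for x in items]
        # offsets[i]:offsets[i + 1] is the byte range of item i
        self._offsets = array("Q", [0])
        for blob in blobs:
            self._offsets.append(self._offsets[-1] + len(blob))
        self._data = b"".join(blobs)

    def __len__(self):
        return len(self._offsets) - 1

    def __getitem__(self, idx):
        if idx < 0:
            idx += len(self)
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        start, end = self._offsets[idx], self._offsets[idx + 1]
        return pickle.loads(self._data[start:end])
```

In a dataset's `__init__`, the list returned by `json.load` could be wrapped once, e.g. `self.annotation = SerializedList(annotations)`, and `__getitem__` could keep indexing it as before (`ann = self.annotation[index]`, then `ann["image"]` and `ann["caption"]`). Whether this resolves the growth reported here has not been verified in this thread.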

Was wondering if there has been any update on this. We ran pretrain.py and saw the same issue: RAM usage increases while the JSON files are being read, and at some point RAM is exhausted. For pretraining, which Python version did you use and how much RAM did the machine have?

Have you solved this problem? Could you kindly provide some suggestions?

aries-young avatar Jul 03 '23 07:07 aries-young

Thanks, will take a look

Have you solved this problem? Could you kindly provide some suggestions?

aries-young avatar Jul 03 '23 07:07 aries-young