About the GPU resource

Open jiangshibiao opened this issue 4 years ago • 10 comments

Hello! I'm trying to run this code on the CoSQL dataset but have run into a problem. With the default parameters, my 16G GPU runs out of memory. I have tried some adjustments, but they didn't help. For example, batch_size defaults to 16, and I still run out of memory even when I change it to 1. (After briefly reading the code, I found that this parameter is overridden by interaction_level, so the effective batch size is always 1 and changing it has no effect.) Could you please share anything you know about reducing the GPU memory requirement, or point out any mistakes I have made?

jiangshibiao avatar May 10 '20 17:05 jiangshibiao

Same concern.

huybery avatar May 13 '20 11:05 huybery

Maybe this problem is caused by the CPU memory being too small. Today I switched to another server with 377G of memory, and it runs successfully. But GPU usage is less than 8G...

yechens avatar Jun 02 '20 07:06 yechens

Wow... This sounds shocking. Why does it need so much CPU memory?

huybery avatar Jun 09 '20 06:06 huybery

@ryanzhumich could you give us some guidance on running CoSQL?

huybery avatar Jun 09 '20 12:06 huybery

I am running out of memory on SParC as well, on a 12GB GPU. I don't have the option of more CPU memory either.

Param-Raval avatar Jun 23 '20 13:06 Param-Raval

SParC runs normally, but CoSQL fails with an out-of-memory error.

huybery avatar Jul 02 '20 06:07 huybery

@huybery Have you solved this problem? I hit OOM on a 12GB GPU.

z666pr avatar Jul 09 '20 07:07 z666pr

No... It still bothers me.

huybery avatar Jul 09 '20 09:07 huybery

Thank you so much for your feedback!

Some interactions can be long enough to cause an OOM issue on the GPU. Depending on your hardware, you can skip those interactions like this. This is more common on the CoSQL train set. It should be fine on the test/dev sets.

ryanzhumich avatar Jul 09 '20 12:07 ryanzhumich
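
The skipping idea in the reply above could be sketched roughly as follows. This is only an illustration, not the repository's actual code: the `utterances` field, the `filter_long_interactions` helper, and the `max_turns` / `max_tokens` thresholds are all hypothetical and would need to be adapted to the real data structures and tuned to the available GPU memory.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: an interaction is a dict whose "utterances" entry
# holds one token list per turn. The real training data may differ.
Interaction = Dict[str, List[List[str]]]


def interaction_size(interaction: Interaction) -> int:
    """Total number of input tokens across all turns of one interaction."""
    return sum(len(utterance) for utterance in interaction["utterances"])


def filter_long_interactions(
    interactions: List[Interaction],
    max_turns: int = 8,      # assumed threshold, tune for your GPU
    max_tokens: int = 300,   # assumed threshold, tune for your GPU
) -> Tuple[List[Interaction], List[Interaction]]:
    """Split interactions into (kept, skipped) by turn and token counts."""
    kept: List[Interaction] = []
    skipped: List[Interaction] = []
    for interaction in interactions:
        too_long = (
            len(interaction["utterances"]) > max_turns
            or interaction_size(interaction) > max_tokens
        )
        (skipped if too_long else kept).append(interaction)
    return kept, skipped
```

Filtering only the training set this way leaves the dev/test data untouched, so evaluation numbers stay comparable; logging how many interactions land in `skipped` shows how much training data the thresholds cost.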

I tried running CoSQL training on a 16GB GPU, but even after skipping some interactions as you suggested, I still get the error "RuntimeError: CUDA out of memory. Tried to allocate 48.00 MiB (GPU 0; 15.90 GiB total capacity; 14.79 GiB already allocated;" Could you please help me resolve this issue?

meghanaraobn2020 avatar Jun 20 '21 18:06 meghanaraobn2020