CUDA out of memory.

Open FiorenzoParascandolo1 opened this issue 11 months ago • 1 comment

Hi, I'm using KANLinear in my own project and I'm running into a CUDA out-of-memory error. Specifically:

- Model A uses a single MLP layer (one and only one MLP layer in the whole network) to map a (160, 8, 197, 197) tensor to a (160, 8, 197, 197) tensor.
- Model B uses a single KANLinear layer (one and only one KANLinear layer in the whole network) to map a (160, 8, 197, 197) tensor to a (160, 8, 197, 197) tensor.

The "whole network" is an MLP-based transformer for both model A and model B. Model A uses 60% of the VRAM on a 24 GB GPU, while model B runs out of CUDA memory. The difference in the number of parameters between the two models is negligible: it is equal to the difference between a single nn.Linear(197, 197) and a single KANLinear(197, 197). So how is it possible for model B to run out of CUDA memory?

Jan 15 '25 11:01 FiorenzoParascandolo1

#63 would be a possible fix for your problem.

Mar 14 '25 09:03 redactedontop
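
For context on the question above: peak VRAM during training usually depends more on the activations kept for the backward pass than on the parameter count. The sketch below is a rough back-of-the-envelope estimate, not a description of KANLinear's actual internals or of what #63 changes. It assumes float32 tensors and a hypothetical spline-based layer that materializes `grid_size + spline_order` basis values per input element; the values `grid_size = 5` and `spline_order = 3` are illustrative assumptions, not taken from this issue.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor with the given shape (float32 by default)."""
    return prod(shape) * bytes_per_element


# Shape from the issue: the layer maps (160, 8, 197, 197) -> (160, 8, 197, 197).
input_shape = (160, 8, 197, 197)

# ASSUMPTION: hypothetical spline settings, not taken from the issue.
grid_size = 5
spline_order = 3
bases_per_feature = grid_size + spline_order  # assumed basis values per input element

# A plain linear layer keeps roughly its input around for the backward pass.
linear_saved = tensor_bytes(input_shape)

# A layer that expands every input element into several basis values would
# hold a tensor with one extra trailing dimension of that size.
kan_basis = tensor_bytes(input_shape + (bases_per_feature,))

GIB = 1024 ** 3
print(f"input activation:   {linear_saved / GIB:.2f} GiB")
print(f"basis tensor alone: {kan_basis / GIB:.2f} GiB")
```

Under these assumptions the basis tensor alone is several times larger than the input, and any intermediates saved while building it would add to that. Measuring with `torch.cuda.max_memory_allocated()` around the layer's forward and backward pass would show the real numbers for a given setup.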