
KAN takes significant time to infer with CUDA?

Open · hoangthangta opened this issue 9 months ago

I set up a KAN model and found that it takes a lot of time to infer the output with CUDA. Here is my test code:

from kan import KAN
import torch
import time

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# without cuda: time model construction + one forward pass
start = time.time()
model = KAN(width=[768, 64, 2], grid=5, k=3)
x = torch.normal(0, 0.5, size=(4, 768))
y = model(x)
end = time.time()
print(end - start)  # 3.04 s

# with cuda: same measurement, model and input on the GPU
start = time.time()
model = KAN(width=[768, 64, 2], grid=5, k=3, device=device)
x = torch.normal(0, 0.5, size=(4, 768)).to(device)
y = model(x)
end = time.time()
print(end - start)  # 10.9 s

What is happening? Am I missing something here?

hoangthangta avatar May 14 '24 15:05 hoangthangta

Put simply, imho, neither the model nor the input x is large enough for the GPU's parallelism to outweigh the overhead of initializing CUDA and transferring that data to the device.

AlessandroFlati avatar May 14 '24 15:05 AlessandroFlati
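A side note on the measurement itself: the timed block above includes model construction, and the first CUDA call also pays one-time context/kernel setup costs; CUDA kernel launches are asynchronous, so the clock should only be read after torch.cuda.synchronize(). A minimal timing sketch of just the forward pass, using a plain torch.nn MLP of the same shape as a stand-in (an assumption here, so it runs even without pykan installed):

```python
import time
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for KAN(width=[768, 64, 2]); same input/output shapes.
model = nn.Sequential(nn.Linear(768, 64), nn.SiLU(), nn.Linear(64, 2)).to(device)
x = torch.normal(0, 0.5, size=(4, 768)).to(device)

# Warm-up: the first call pays one-time setup costs (CUDA context, kernels).
for _ in range(3):
    model(x)
if device.type == "cuda":
    torch.cuda.synchronize()

start = time.time()
y = model(x)
if device.type == "cuda":
    torch.cuda.synchronize()  # wait for async kernels before stopping the clock
elapsed = time.time() - start
print(f"forward pass: {elapsed:.6f} s")
```

Measured this way, the steady-state forward time is typically far smaller than the numbers above, which are dominated by construction and one-time setup.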

Just fixed a bunch of CUDA-related issues, and CUDA now seems to run much faster (20x speedup) than CPU for a [4,100,100,100,1] KAN: https://github.com/KindXiaoming/pykan/blob/master/tutorials/API_10_device.ipynb

KindXiaoming avatar Jul 22 '24 00:07 KindXiaoming
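The gap also closes as the work per call grows: with a tiny batch of 4, per-call overhead dominates, while a large batch amortizes it. A rough sketch of the batch-size effect, again with a plain torch.nn stand-in of the [4,100,100,100,1] shape (an assumption, so it runs without pykan); on a CUDA machine the large-batch case is where the GPU pulls ahead:

```python
import time
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for a [4,100,100,100,1] KAN: same layer widths.
model = nn.Sequential(
    nn.Linear(4, 100), nn.SiLU(),
    nn.Linear(100, 100), nn.SiLU(),
    nn.Linear(100, 100), nn.SiLU(),
    nn.Linear(100, 1),
).to(device)

def time_forward(batch_size):
    """Time one forward pass after a warm-up call."""
    x = torch.randn(batch_size, 4, device=device)
    model(x)  # warm-up
    if device.type == "cuda":
        torch.cuda.synchronize()
    start = time.time()
    model(x)
    if device.type == "cuda":
        torch.cuda.synchronize()
    return time.time() - start

t_small = time_forward(4)
t_large = time_forward(4096)
print(f"batch 4: {t_small:.6f} s, batch 4096: {t_large:.6f} s")
```

On a GPU, the batch-4096 time is usually only slightly larger than the batch-4 time, since per-call overhead rather than arithmetic dominates at this scale.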