pykan
Error AssertionError: Torch not compiled with CUDA enabled with PC only
AssertionError                            Traceback (most recent call last)
Cell In[26], line 10
      7 device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
      8 print(device)
---> 10 model = KAN(width=[15,6,3,1], grid=5, k=3,device=device) #, grid_range=(0,1)) #, seed=0) # noise_scale_base = 0., base_fun = lambda x: x, noise_scale = 0)
     12 # model = KAN(width=[12,6,3,1], grid=GBest, k=kBest) #, seed=0) # noise_scale_base = 0., base_fun = lambda x: x, noise_scale = 0)

File c:\Users\thnog\AppData\Local\Programs\Python\Python311\Lib\site-packages\kan\KAN.py:140, in KAN.__init__(self, width, grid, k, noise_scale, noise_scale_base, base_fun, symbolic_enabled, bias_trainable, grid_eps, grid_range, sp_trainable, sb_trainable, device, seed)
    137 for l in range(self.depth):
    138     # splines
    139     scale_base = 1 / np.sqrt(width[l]) + (torch.randn(width[l] * width[l + 1], ) * 2 - 1) * noise_scale_base
--> 140     sp_batch = KANLayer(in_dim=width[l], out_dim=width[l + 1], num=grid, k=k, noise_scale=noise_scale, scale_base=scale_base, scale_sp=1., base_fun=base_fun, grid_eps=grid_eps, grid_range=grid_range, sp_trainable=sp_trainable,
    141                         sb_trainable=sb_trainable, device=device)
    142     self.act_fun.append(sp_batch)
    144     # bias

File c:\Users\thnog\AppData\Local\Programs\Python\Python311\Lib\site-packages\kan\KANLayer.py:126, in KANLayer.__init__(self, in_dim, out_dim, num, k, noise_scale, scale_base, scale_sp, base_fun, grid_eps, grid_range, sp_trainable, sb_trainable, device)
    124     self.scale_base = torch.nn.Parameter(torch.ones(size, device=device) * scale_base).requires_grad_(sb_trainable)  # make scale trainable
    125 else:
--> 126     self.scale_base = torch.nn.Parameter(torch.FloatTensor(scale_base).cuda()).requires_grad_(sb_trainable)
    127 self.scale_sp = torch.nn.Parameter(torch.ones(size, device=device) * scale_sp).requires_grad_(sp_trainable)  # make scale trainable
    128 self.base_fun = base_fun

File c:\Users\thnog\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\cuda\__init__.py:284, in _lazy_init()
...
    286 raise AssertionError(
    287     "libcudart functions unavailable. It looks like you have a broken build?"
    288 )
AssertionError: Torch not compiled with CUDA enabled
If you have a GPU that supports CUDA and a CUDA-enabled PyTorch build, torch.device('cuda' if torch.cuda.is_available() else 'cpu') will return 'cuda'.
That said, if you didn't install a PyTorch build tagged +cu121 or similar, you'll get this error. Please follow the official PyTorch installation documentation.
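A quick way to see which kind of build is installed is to inspect PyTorch itself. The snippet below is only a diagnostic sketch: `describe_torch_build` is a hypothetical helper name, not part of PyTorch or pykan, and it relies on `torch.version.cuda` being `None` on CPU-only builds.

```python
import importlib.util


def describe_torch_build():
    """Return a small dict describing the local PyTorch build, if one is installed."""
    # Hypothetical helper for illustration; not part of torch or pykan.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        # pip wheels usually carry a local tag such as "+cpu" or "+cu121"
        "version": torch.__version__,
        # None when the wheel was built without CUDA support
        "built_with_cuda": torch.version.cuda is not None,
        # True only if a usable GPU and driver are present as well
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

If `built_with_cuda` is false, any explicit `.cuda()` call will raise the AssertionError shown above, regardless of what `device` was passed to the model.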
It's a minor issue in KANLayer.py:
    if isinstance(scale_base, float):
        self.scale_base = torch.nn.Parameter(torch.ones(size, device=device) * scale_base).requires_grad_(sb_trainable)  # make scale trainable
    else:
        self.scale_base = torch.nn.Parameter(torch.FloatTensor(scale_base).cuda()).requires_grad_(sb_trainable)
Just remove the .cuda() from the last line, or add a conditional that moves the tensor to CUDA only if device == 'cuda'. (This seems to have been fixed in master already, so just git pull again.)
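A device-agnostic version of that branch could look roughly like the sketch below. It is not the actual change merged upstream; `make_scale_base` is a hypothetical standalone helper that mirrors the two branches of the snippet and replaces the hard-coded `.cuda()` with `.to(device)`.

```python
import torch


def make_scale_base(scale_base, size, device="cpu", trainable=True):
    """Build the scale_base parameter on the requested device.

    Hypothetical helper for illustration; pykan sets this attribute
    inside KANLayer.__init__ rather than through a function like this.
    """
    if isinstance(scale_base, float):
        # Scalar case: broadcast the value over `size` entries on `device`.
        value = torch.ones(size, device=device) * scale_base
    else:
        # Tensor/array case: follow `device` instead of forcing CUDA.
        value = torch.as_tensor(scale_base, dtype=torch.float32).to(device)
    return torch.nn.Parameter(value).requires_grad_(trainable)


if __name__ == "__main__":
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    param = make_scale_base(torch.randn(6), size=6, device=device)
    print(param.device)
```

Because `.to(device)` is a no-op when the tensor is already on the target device, the same code path works on CPU-only and CUDA builds without an extra conditional.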
It seems that you're using pykan v0.0.3, and the issue you encountered has been fixed in PR #98. However, it hasn't been released yet.
@KindXiaoming, could you please make a new release on the master branch to incorporate the fix and resolve the issue? Thanks in advance!
Hi @Jim137, I think I merged PR #98 yesterday, and pykan v0.0.3 was released after the merge. Please check whether it works now, thank you!
Hi @KindXiaoming, Apologies for the confusion. It seems that pykan v0.0.3 doesn't contain PR #98. The last commit in v0.0.3 is 116f399, which predates commit 70b7b8d where the fix was implemented.
Thanks @Jim137, is it good now?
@KindXiaoming, sorry, I meant for users installing from PyPI. The error still occurs.
Here is my test using the PyPI version:
❯ /home/jim137/git/kan/test/bin/python /home/jim137/git/kan/test1.py
description: 0%| | 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/jim137/git/kan/test1.py", line 23, in <module>
    model.train(dataset, opt="LBFGS", steps=20)
  File "/home/jim137/git/kan/test/lib/python3.11/site-packages/kan/KAN.py", line 898, in train
    self.update_grid_from_samples(dataset['train_input'][train_id].to(device))
  File "/home/jim137/git/kan/test/lib/python3.11/site-packages/kan/KAN.py", line 243, in update_grid_from_samples
    self.forward(x)
  File "/home/jim137/git/kan/test/lib/python3.11/site-packages/kan/KAN.py", line 311, in forward
    x_numerical, preacts, postacts_numerical, postspline = self.act_fun[l](x)
                                                           ^^^^^^^^^^^^^^^^^^
  File "/home/jim137/git/kan/test/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jim137/git/kan/test/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jim137/git/kan/test/lib/python3.11/site-packages/kan/KANLayer.py", line 176, in forward
    y = self.scale_base.unsqueeze(dim=0) * base + self.scale_sp.unsqueeze(dim=0) * y
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
It seems that #98 is not included in v0.0.3.
Thank you, gotcha! I have released 0.0.4, which includes the new changes.
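For anyone installing from PyPI, it may help to confirm that the installed pykan is at least 0.0.4 before retrying. The sketch below uses only the standard library; `parse_version` and `installed_at_least` are hypothetical helper names, and the version parsing is deliberately simplified (it ignores pre-release and local tags).

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '0.0.4' into (0, 0, 4); a simplification that ignores pre-release tags."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def installed_at_least(dist_name, minimum):
    """Return True or False, or None if the distribution is not installed."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # "pykan" is the distribution name on PyPI; the import name is `kan`.
    print(installed_at_least("pykan", "0.0.4"))
```

If this prints False, upgrading with `pip install --upgrade pykan` should pull in the newer release.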