HELP! CUDA cannot be used with my own dataset

Open AnBydAm opened this issue 6 months ago • 4 comments

My dataset is built from a .csv file of my own. In the previous version, I could train the model with model.train and CUDA worked normally. I am now on the latest pykan 0.2.4, where model.train has been replaced by model.fit, and I cannot get CUDA to work with model.fit. Here is my code:

1. I first created the dataset from the .csv file with the following code:

    import pandas as pd
    import torch
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    import os

    os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'
    torch.set_default_dtype(torch.float64)
    device = 'cuda'

    data = pd.read_csv('0°冬夏半年辐射能量12000.csv')
    bp_year_range = (0, 12000)
    filtered_data = data[(data['BP Year'] >= bp_year_range[0]) & (data['BP Year'] <= bp_year_range[1])]
    split_index = int(len(filtered_data) * 0.8)

    duration = filtered_data['Winter year Days'][:split_index].values.reshape(-1, 1)
    intensity = filtered_data['Winter year Intensity'][:split_index].values.reshape(-1, 1)
    train_input = np.hstack((duration, intensity))
    train_label = filtered_data['Winter year energy '][:split_index].values.reshape(-1, 1)

    duration_test = filtered_data['Winter year Days'][split_index:].values.reshape(-1, 1)
    intensity_test = filtered_data['Winter year Intensity'][split_index:].values.reshape(-1, 1)
    test_input = np.hstack((duration_test, intensity_test))
    test_label = filtered_data['Winter year energy '][split_index:].values.reshape(-1, 1)

    scaler_input = StandardScaler()
    scaler_label = StandardScaler()
    scaler_input.fit(train_input)
    scaler_label.fit(train_label)
    train_input_scaled = scaler_input.transform(train_input)
    test_input_scaled = scaler_input.transform(test_input)
    train_label_scaled = scaler_label.transform(train_label)
    test_label_scaled = scaler_label.transform(test_label)

    train_input_tensor = torch.tensor(train_input_scaled, dtype=torch.float64).to(device)
    train_label_tensor = torch.tensor(train_label_scaled, dtype=torch.float64).to(device)
    test_input_tensor = torch.tensor(test_input_scaled, dtype=torch.float64).to(device)
    test_label_tensor = torch.tensor(test_label_scaled, dtype=torch.float64).to(device)

    dataset = {
        'train_input': train_input_tensor,
        'test_input': test_input_tensor,
        'train_label': train_label_tensor,
        'test_label': test_label_tensor
    }

2. Model creation and initialization both go smoothly:

    from kan import *
    model = KAN(width=[2,3,1], grid=5, k=3).to(device)
    model(dataset['train_input']);
    model.plot()

3. An error occurs when calling model.fit:

    model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01, lamb_entropy=10.);

    RuntimeError                              Traceback (most recent call last)
    Cell In[4], line 9
          1 # Train the model
          2 # Use the LBFGS optimizer
          3 # Train for 20 steps
       (...)
          7 # model.train(dataset, opt="LBFGS", steps=20,device=device);
          8 # model.to(device)
    ----> 9 model.fit(dataset, opt="LBFGS", steps=20, lamb=0.01, lamb_entropy=10.)

    File ~\pykan-master0.2.4\pykan-master0.24\kan\MultKAN.py:946, in MultKAN.fit(self, dataset, opt, steps, log, lamb, lamb_l1, lamb_entropy, lamb_coef, lamb_coefdiff, update_grid, grid_update_num, loss_fn, lr, start_grid_update_step, stop_grid_update_step, batch, metrics, save_fig, in_vars, out_vars, beta, save_fig_freq, img_folder, singularity_avoiding, y_th, reg_metric, display_metrics)
        943 test_id = np.random.choice(dataset['test_input'].shape[0], batch_size_test, replace=False)
        945 if _ % grid_update_freq == 0 and _ < stop_grid_update_step and update_grid and _ >= start_grid_update_step:
    --> 946     self.update_grid(dataset['train_input'][train_id])
        948 if opt == "LBFGS":
        949     optimizer.step(closure)

    File ~\pykan-master0.2.4\pykan-master0.24\kan\MultKAN.py:358, in MultKAN.update_grid(self, x)
        357 def update_grid(self, x):
    --> 358     self.update_grid_from_samples(x)

    File ~\pykan-master0.2.4\pykan-master0.24\kan\MultKAN.py:355, in MultKAN.update_grid_from_samples(self, x)
        353 for l in range(self.depth):
        354     self.get_act(x)
    --> 355     self.act_fun[l].update_grid_from_samples(self.acts[l])

    File ~\pykan-master0.2.4\pykan-master0.24\kan\KANLayer.py:241, in KANLayer.update_grid_from_samples(self, x, mode)
        238 y_eval = coef2curve(x_pos, self.grid, self.coef, self.k)
        240 self.grid.data = extend_grid(grid, k_extend=self.k)
    --> 241 self.coef.data = curve2coef(x_pos, y_eval, self.grid, self.k)

    File ~\pykan-master0.2.4\pykan-master0.24\kan\spline.py:176, in curve2coef(x_eval, y_eval, grid, k, lamb)
        174 n1, n2, n = XtX.shape[0], XtX.shape[1], XtX.shape[2]
        175 identity = torch.eye(n,n)[None, None, :, :].expand(n1, n2, n, n)
    --> 176 A = XtX + lamb * identity
        177 B = Xty
        178 coef = (A.pinverse() @ B)[:,:,:,0]

    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I have made several attempts at debugging this, including following the author's API_10_device.ipynb, but I have not been able to resolve it. If you have any suggestions, please share them.
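For anyone reading the traceback: the last frame fails at `A = XtX + lamb * identity`, where `identity` comes from `torch.eye(n,n)` with no `device` argument, so it is created on the CPU. What follows is a minimal pure-PyTorch sketch of that pattern, assuming `XtX` is the tensor on `cuda:0`. The helper name `regularized_gram`, the tensor shapes, and the default `lamb` are hypothetical and are not pykan code. The sketch falls back to the CPU when no GPU is present.

```python
import torch

# Use the GPU when available; fall back to the CPU so the sketch still runs.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def regularized_gram(XtX: torch.Tensor, lamb: float = 1e-8) -> torch.Tensor:
    """Hypothetical helper (not pykan code): add lamb * I to a batch of Gram matrices."""
    n1, n2, n = XtX.shape[0], XtX.shape[1], XtX.shape[2]
    # Create the identity on the same device and dtype as XtX,
    # instead of relying on torch.eye's CPU default.
    identity = torch.eye(n, device=XtX.device, dtype=XtX.dtype)
    identity = identity[None, None, :, :].expand(n1, n2, n, n)
    return XtX + lamb * identity


# Illustrative shapes only: a 2 x 3 batch of 4 x 4 matrices.
XtX = torch.rand(2, 3, 4, 4, dtype=torch.float64, device=device)
A = regularized_gram(XtX)
print(A.device, A.dtype, A.shape)
```

If the identity were instead built with a plain `torch.eye(n, n)` while `XtX` lives on the GPU, the addition would be expected to raise the same "Expected all tensors to be on the same device" error shown above.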

AnBydAm, Aug 04 '24 03:08