pykan
How to train on my own dataset for regression?
Hello, how do I train on my own dataset for a regression task? I created the dataset this way to test regression.
dataset = {
    'train_input': torch.from_numpy(X_train[:3000]),
    'test_input': torch.from_numpy(X_test[:2000]),
    'train_label': torch.from_numpy(y_train[:3000]),
    'test_label': torch.from_numpy(y_test[:2000]),
}
but when I tried to train the model
model.train(dataset, opt="LBFGS", steps=20, lamb=0.01, lamb_entropy=10.);
it gave me an error:
`File /opt/conda/lib/python3.10/site-packages/kan/LBFGS.py:319, in LBFGS.step(self, closure)
    316 state.setdefault('n_iter', 0)
    318 # evaluate initial f(x) and df/dx
--> 319 orig_loss = closure()
    320 loss = float(orig_loss)
    321 current_evals = 1
File /opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.
File /opt/conda/lib/python3.10/site-packages/kan/KAN.py:897, in KAN.train.
IndexError: index 2941 is out of bounds for dimension 0 with size 2000`
Hi, could you please check the shapes of your inputs and labels? In particular, dataset['train_label'] should have shape [3000, x], but it looks like it has shape [2000, x] somehow.
@SuleymanSuleymanzade, it appears that you may have a data slicing issue when creating your dataset. Can you post the shapes of each of your dataset components, like so:
print("Train Input Shape:", dataset['train_input'].shape)
print("Train Label Shape:", dataset['train_label'].shape)
print("Test Input Shape:", dataset['test_input'].shape)
print("Test Label Shape:", dataset['test_label'].shape)
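The same check can be automated. This is a minimal sketch, assuming each dataset entry exposes a `.shape` tuple (as torch tensors and NumPy arrays do). `check_dataset_shapes` is a hypothetical helper, not part of pykan, and the 2-D label expectation is taken from the `[3000, x]` shape mentioned above.

```python
def check_dataset_shapes(dataset):
    """Return a list of problems found; an empty list means the shapes look consistent."""
    problems = []
    for split in ("train", "test"):
        input_shape = tuple(dataset[f"{split}_input"].shape)
        label_shape = tuple(dataset[f"{split}_label"].shape)
        # Inputs and labels must have the same number of rows (dimension 0).
        if input_shape[0] != label_shape[0]:
            problems.append(
                f"{split}: {input_shape[0]} input rows but {label_shape[0]} label rows"
            )
        # Assumption from this thread: labels are 2-D, i.e. [N, x].
        if len(label_shape) != 2:
            problems.append(f"{split}_label is not 2-D: shape {label_shape}")
    return problems
```

Calling `check_dataset_shapes(dataset)` before `model.train(...)` would flag a mismatch such as 3000 input rows against 2000 label rows before the optimizer ever runs.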
I haven't seen this error before, but the fact that your training and test data both appear to be sliced from the start of the data ([:3000] and [:2000], respectively) looks suspicious. I would suggest slicing it this way to see if it works for you:
# Assuming your original dataset is stored as a DataFrame `df` with all the features you need:
# Replace the slicing with whatever range you need.
import numpy as np
import torch
dataset = {}
# Without inplace=True, drop() returns a new DataFrame (with inplace=True it returns None, which cannot be sliced).
features = df.drop(columns='<target_var>')
train_input, train_label = np.array(features[:-1000]), np.array(df['<target_var>'][:-1000])
test_input, test_label = np.array(features[-1000:]), np.array(df['<target_var>'][-1000:])
dataset['train_input'] = torch.from_numpy(train_input)
dataset['train_label'] = torch.from_numpy(train_label.reshape(-1, 1))
dataset['test_input'] = torch.from_numpy(test_input)
dataset['test_label'] = torch.from_numpy(test_label.reshape(-1, 1))
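For anyone starting from plain NumPy arrays rather than a DataFrame, the same idea can be sketched as a small function. This is only an illustration: `make_dataset` and its `n_test` parameter are hypothetical names, not part of pykan, and the cast to float32 is an assumption (NumPy defaults to float64, and `torch.from_numpy` keeps that dtype, while torch parameters default to float32).

```python
import numpy as np
import torch


def make_dataset(X, y, n_test):
    """Build the four-key dict used in this thread; the last n_test rows become the test set."""
    X = np.asarray(X, dtype=np.float32)
    # Labels are reshaped to [N, 1] so they are 2-D like the inputs.
    y = np.asarray(y, dtype=np.float32).reshape(-1, 1)
    if X.shape[0] != y.shape[0]:
        raise ValueError(f"{X.shape[0]} input rows but {y.shape[0]} label rows")
    if not 0 < n_test < X.shape[0]:
        raise ValueError("n_test must be between 1 and the number of rows minus 1")
    return {
        "train_input": torch.from_numpy(X[:-n_test]),
        "train_label": torch.from_numpy(y[:-n_test]),
        "test_input": torch.from_numpy(X[-n_test:]),
        "test_label": torch.from_numpy(y[-n_test:]),
    }


# Toy data: 50 samples, 3 features, target is the sum of the features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X.sum(axis=1)
dataset = make_dataset(X, y, n_test=10)
```

Because the function raises as soon as inputs and labels disagree on row count, a mismatch shows up at dataset construction rather than inside the optimizer's closure.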