Soft-Decision-Tree
Is it suitable for regression prediction?
Hello, I'd like to ask: if I want to do regression prediction with output_dim == 1, is SDT applicable? (It seems to be used only for classification models.)
Thanks!
Hi @775269512, I think it is straightforward to use SDT on regression tasks: simply change the training criterion in main.py, and there should be no need to modify anything inside the implementation of SDT.
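For reference, a minimal sketch of what that change could look like, assuming main.py currently builds an nn.CrossEntropyLoss for classification (the variable names below are illustrative, not necessarily the exact ones in main.py):
'''
import torch.nn as nn

# criterion = nn.CrossEntropyLoss()   # classification (current setup)
criterion = nn.MSELoss()               # regression

# Inside the training loop, targets are float values instead of class indices,
# e.g. something like:
#   output, penalty = tree.forward(data, is_training_data=True)
#   loss = criterion(output.view(-1), target.float().view(-1)) + penalty
'''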
Hi, I did a simple experiment. Although the overall loss is decreasing, the output is the same for every sample, so it cannot actually regress (here out_dim = 1). The data is:
'''
x = tensor([[ 1.,  1.,  1.,  1.,  1.],
            [ 2.,  2.,  2.,  2.,  2.],
            [ 3.,  3.,  3.,  3.,  3.],
            [ 4.,  4.,  4.,  4.,  4.],
            [ 5.,  5.,  5.,  5.,  5.],
            [ 6.,  6.,  6.,  6.,  6.],
            [ 7.,  7.,  7.,  7.,  7.],
            [ 8.,  8.,  8.,  8.,  8.],
            [ 9.,  9.,  9.,  9.,  9.],
            [10., 10., 10., 10., 10.]])
y = np.array([5.56, 5.70, 5.91, 6.40, 6.80, 7.05, 8.90, 8.70, 9.00, 9.05]).ravel()
'''
I got this result:
'''
tensor([[7.1672],
        [7.3185],
        [7.3203],
        [7.3204],
        [7.3204],
        [7.3204],
        [7.3204],
        [7.3204],
        [7.3204],
        [7.3204]], grad_fn=<MmBackward>)
Epoch: 499 | Loss: 1.93230 | Correct: 000/128
'''
I don't know how to fix this. It seems that when out_dim = 1, every leaf node ends up with the same value.
QAQ
Could you show me your code snippet for training and evaluation?
Yep, here it is. In addition, I found a paper based on SDT that I will study: "SDTR: Soft Decision Tree Regressor for Tabular Data". The differentiable decision tree is difficult to understand, but it is a nice interpretable model and I want to use it for something.
Btw, I will also start postgraduate study at NJU next semester, so it turns out you are my senior~
'''
import torch
import torch.nn as nn
from sklearn.datasets import fetch_california_housing

from SDT import SDT  # SDT model from this repository (adjust the import path if needed)

# Hyperparameters (depth, lamda, use_cuda, lr, weight_decay, epochs, batch_size)
# are defined elsewhere in my script.

# Load data
housing = fetch_california_housing()
xs = torch.from_numpy(housing["data"]).float()
ys = torch.from_numpy(housing["target"]).unsqueeze(1).float()
print(xs.size())
print(ys.size())
print(xs)

input_dim = xs.size()[1]
output_dim = ys.size()[1]

# Model and Optimizer
tree = SDT(input_dim, output_dim, depth, lamda, use_cuda)
optimizer = torch.optim.Adam(tree.parameters(),
                             lr=lr,
                             weight_decay=weight_decay)

# Utils
best_testing_acc = 0.0
testing_acc_list = []
training_loss_list = []
criterion = nn.MSELoss()
device = torch.device("cuda" if use_cuda else "cpu")

for epoch in range(epochs):

    # Training
    tree.train()
    output, penalty = tree.forward(xs, is_training_data=True)
    print(output)

    loss = criterion(output, ys.view(-1))
    # loss += penalty

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print training status
    pred = output.data.max(1)[1]
    correct = pred.eq(ys.view(-1).data).sum()

    msg = (
        "Epoch: {:02d} | Loss: {:.5f} |"
        " Correct: {:03d}/{:03d}"
    )
    print(msg.format(epoch, loss, correct, batch_size))

    training_loss_list.append(loss.cpu().data.numpy())
'''
It looks like you are using a full-batch training process (i.e., without a dataloader that samples mini-batches). Maybe you should consider using one and training SDT in a stochastic way. Also, what values are you using for the learning rate and weight decay?
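For example, something along these lines, as a rough sketch (it reuses xs, ys, tree, optimizer, criterion, device, and epochs from your snippet above; the batch size of 128 is just a placeholder):
'''
from torch.utils.data import TensorDataset, DataLoader

# Wrap the full-batch tensors in a dataset and draw shuffled mini-batches.
train_loader = DataLoader(TensorDataset(xs, ys), batch_size=128, shuffle=True)

for epoch in range(epochs):
    tree.train()
    for data, target in train_loader:
        data, target = data.to(device), target.to(device)
        output, penalty = tree.forward(data, is_training_data=True)
        loss = criterion(output.view(-1), target.view(-1))
        # loss = loss + penalty   # optionally keep the regularization term
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
'''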