JDA
NaN leaf score
Hello,
First of all, thank you for open sourcing this code. It is excellent. I'm encountering an error during training where a leaf node can receive a NaN score. After this happens, training freezes. The error has to occur in the following block of code:
if (node_idx >= nodes_n / 2) {
    // we are on a leaf node
    const int idx = node_idx - nodes_n / 2;
    double pos_w, neg_w;
    pos_w = neg_w = c.esp;
    for (int i = 0; i < pos_n; i++)
        pos_w += pos.weights[pos_idx[i]];
    for (int i = 0; i < neg_n; i++)
        neg_w += neg.weights[neg_idx[i]];
    float score = 0.5 * log(pos_w / neg_w);
    scores[idx] = isnan(score) ? 0. : score;
    return;
}
I added the NaN check above the return myself to work around the issue, but I'm not sure setting the score to 0 is the proper solution. Do you have any insight on better ways to avoid this problem?
@JordanCheney Pay attention to the leaf scores in the first few CARTs; they shouldn't be too large. The problem is caused by an internal node split that can leave a leaf node with no face samples or no non-face samples. The score threshold may then be unusually large and cause the weights to explode when exp() is calculated.
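To make that chain concrete, here is a minimal standalone sketch, not code from the repository. It assumes sample weights of the form exp(-y * score) stored as float, and an epsilon of 1e-10 in the usage below; the helper names `LeafScore`, `ExpWeight` and `ClampedLeafScore` are hypothetical. The clamp is only one possible mitigation, and its bound would have to be tuned.

```cpp
#include <cassert>
#include <cmath>

// Leaf score as in the snippet quoted above: half the log-ratio of the summed
// positive and negative weights, each padded with a small epsilon.
double LeafScore(double pos_sum, double neg_sum, double eps) {
    assert(eps > 0.0);
    return 0.5 * std::log((pos_sum + eps) / (neg_sum + eps));
}

// ASSUMPTION: weights are computed as exp(-y * score) in single precision.
// For an exponent above roughly 88.7 the float result overflows to +inf.
float ExpWeight(int label, float score) {
    return std::exp(static_cast<float>(-label) * score);
}

// Hypothetical mitigation: clamp the score to [-max_abs, max_abs] so that a
// pure leaf cannot dominate later weight updates. NaN still falls back to 0.
double ClampedLeafScore(double pos_sum, double neg_sum, double eps,
                        double max_abs) {
    const double s = LeafScore(pos_sum, neg_sum, eps);
    if (std::isnan(s)) return 0.0;
    return std::fmax(-max_abs, std::fmin(max_abs, s));
}
```

With these helpers, `LeafScore(0.5, 0.0, 1e-10)` is about 11.2 for a leaf that holds only face samples, while a balanced leaf such as `LeafScore(0.5, 0.5, 1e-10)` scores 0. Once an accumulated score pushes `ExpWeight` past the float range, the weight becomes inf, and a leaf whose positive and negative sums are both inf evaluates log(inf / inf), which is NaN.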
I understand that the math says "pure" splits (all face or all non-face) will basically cause the scores to explode. This seems odd to me, though, as the goal of a tree should be to split the data perfectly, no? Of course, this should be very difficult to accomplish, but still. I suppose this isn't really a bug but a strange artifact of my data. For the record, I was able to get a full cascade to train using my fix above, but the scores weren't comparable to the paper's, and I'm hoping this is the reason.