
DTree train doesn't classify training data correctly

Open opencv-pushbot opened this issue 10 years ago • 0 comments

Transferred from http://code.opencv.org/issues/4281

|| Siddharth Krishna on 2015-04-17 00:01
|| Priority: Normal
|| Affected: branch '2.4' (2.4-dev)
|| Category: ml
|| Tracker: Bug
|| Difficulty: 
|| PR: 
|| Platform: x64 / Linux

I need to train a decision tree that fits my training data completely; I _want_ it to overfit. That means no pruning, and the tree should keep growing until every leaf contains samples of a single label. This is a classification task with two labels. Here are the params I used:

<pre>
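  CvDTreeParams params;
  params.min_sample_count = -1;         // intent: no lower bound on samples per node
  params.regression_accuracy = 0;       // termination criterion for regression trees
  params.use_surrogates = false;        // no surrogate splits
  params.truncate_pruned_tree = false;  // do not physically remove pruned branches
  params.cv_folds = 0;                  // no cross-validation pruning
  params.use_1se_rule = false;          // no harsher (1SE) pruning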
</pre>

And here is how I'm training:
<pre>

  cv::Mat trainData(numSamples, dim, CV_32FC1);
  cv::Mat trainLabels(numSamples, 1, CV_32SC1);

  // ...

  CvDTree* dtree = new CvDTree();

  // one type entry per input variable, plus one for the response
  cv::Mat var_type(newDim + 1, 1, CV_8U);
  // all inputs are numerical
  var_type.setTo(cv::Scalar(CV_VAR_NUMERICAL));
  // output is categorical
  var_type.at<uchar>(newDim, 0) = CV_VAR_CATEGORICAL;

  dtree->train(trainData, CV_ROW_SAMPLE, trainLabels,
              cv::Mat(), cv::Mat(), var_type, cv::Mat(), params);
</pre>

According to the documentation, these settings should grow a tree that classifies all training data correctly. But on the input attached as samples.txt (one point per row; the last integer on each row is the label), training returns a tree that misclassifies a training point.
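
Two quick checks may help narrow down a report like this: whether the data can be fit perfectly at all, and which rows a trained model gets wrong. The sketch below is not part of OpenCV and is only an illustration. It assumes samples.txt is whitespace-separated (the separator is not stated above), and the names `Row`, `parseRows`, `conflictingRows` and `misclassifiedRows` are hypothetical. Values are parsed as `float` to mirror the `CV_32FC1` training matrix, so two points that differ only beyond float precision count as identical. No tree can separate identical points that carry different labels.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One parsed line: feature values followed by an integer label.
struct Row {
    std::vector<float> features;
    int label;
};

// Parses whitespace-separated rows (assumed format); the last number on
// each line is taken as the label, everything before it as features.
std::vector<Row> parseRows(std::istream& in)
{
    std::vector<Row> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::vector<float> values;
        float v;
        while (fields >> v) values.push_back(v);
        if (values.size() < 2) continue;  // skip blank or malformed lines
        Row r;
        r.label = static_cast<int>(values.back());
        values.pop_back();
        r.features = values;
        rows.push_back(r);
    }
    return rows;
}

// Returns (first occurrence, later row) index pairs where the later row has
// the same features as the first occurrence but a different label.
std::vector<std::pair<std::size_t, std::size_t> >
conflictingRows(const std::vector<Row>& rows)
{
    std::map<std::vector<float>, std::size_t> firstSeen;
    std::vector<std::pair<std::size_t, std::size_t> > conflicts;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        auto it = firstSeen.find(rows[i].features);
        if (it == firstSeen.end()) {
            firstSeen[rows[i].features] = i;
        } else if (rows[it->second].label != rows[i].label) {
            conflicts.push_back(std::make_pair(it->second, i));
        }
    }
    return conflicts;
}

// Returns the indices of rows whose predicted label differs from the
// training label. `predict` is any callable wrapping the trained model.
std::vector<std::size_t> misclassifiedRows(
    const std::vector<Row>& rows,
    const std::function<int(const std::vector<float>&)>& predict)
{
    std::vector<std::size_t> bad;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        if (predict(rows[i].features) != rows[i].label) bad.push_back(i);
    }
    return bad;
}
```

If `conflictingRows` returns an empty list, the data is separable in principle and the misfit comes from the training side. With the 2.4 C++ interface, `predict` could wrap something like `dtree->predict(sampleRow)->value` on a 1 x dim `cv::Mat` row. Treat that wiring as an assumption to verify against your build.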

History

Vadim Pisarevsky on 2015-04-27 11:11
-   Category set to ml

opencv-pushbot, Jul 27 '15 11:07