DTree classifier crashes without error message
Transferred from http://code.opencv.org/issues/4480
|| Tom Krause on 2015-07-10 07:24
|| Priority: Normal
|| Affected: branch 'master' (3.0-dev)
|| Category: ml
|| Tracker: Bug
|| Difficulty: Medium
|| PR:
|| Platform: x64 / Linux
When I try to use DTrees in OpenCV 3.0, the "train" method crashes without giving an error message.
I tried many configurations, but here is a very simple example.
cv::Mat sampleMat = cv::Mat(10,2, CV_32FC1);
cv::Mat labelMat = cv::Mat(10,1, CV_32SC1);
float* sampleData = (float*)sampleMat.data;
int* labelData = (int*)labelMat.data;
(*sampleData++) = 1.f; (*sampleData++) = 1.f; (*labelData++) = 1;
(*sampleData++) = 1.f; (*sampleData++) = 3.f; (*labelData++) = 1;
(*sampleData++) = 2.f; (*sampleData++) = 8.f; (*labelData++) = 1;
(*sampleData++) = 1.f; (*sampleData++) = 23.f; (*labelData++) = 1;
(*sampleData++) = 2.f; (*sampleData++) = 546.f; (*labelData++) = 1;
(*sampleData++) = 4.f; (*sampleData++) = 1.f; (*labelData++) = 2;
(*sampleData++) = 3.f; (*sampleData++) = 3.f; (*labelData++) = 2;
(*sampleData++) = 4.f; (*sampleData++) = 8.f; (*labelData++) = 2;
(*sampleData++) = 4.f; (*sampleData++) = 23.f; (*labelData++) = 2;
(*sampleData++) = 5.f; (*sampleData++) = 546.f; (*labelData++) = 2;
cv::Ptr<cv::ml::TrainData> trainData = cv::ml::TrainData::create(sampleMat, cv::ml::ROW_SAMPLE, labelMat);
cv::Ptr<cv::ml::DTrees> dtree = cv::ml::DTrees::create();
dtree->train(trainData);
History
Tom Krause on 2015-07-10 07:29
I found a message in the terminal; maybe this helps a little:
"terminate called after throwing an instance of 'std::length_error'
what(): vector::reserve"
be rak on 2015-07-10 11:05
I can reproduce it. If maxDepth is not set, its default value of INT_MAX is used, and addTree tries to reserve an absurd amount of memory here:
<pre>
int DTreesImpl::addTree(const vector<int>& sidx )
{
    size_t n = (params.getMaxDepth() > 0 ? (1 << params.getMaxDepth()) : 1024) + w->wnodes.size();
    w->wnodes.reserve(n);
    w->wsplits.reserve(n);
    w->wsubsets.reserve(n*w->maxSubsetSize);
    ...
</pre>
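The failure mode can be reproduced without OpenCV. The following is a minimal sketch, assuming only the standard library: `reserveThrowsLengthError` shows that `std::vector::reserve` rejects a request above `max_size()` with the `std::length_error` seen in the terminal, and `cappedNodeReserve` is a hypothetical helper (its name and the `depthCap` default are illustrative assumptions, not OpenCV's actual fix) that clamps the depth before shifting.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reserving n elements is rejected with std::length_error,
// the exception reported in the terminal output above.
bool reserveThrowsLengthError(std::size_t n)
{
    std::vector<int> v;
    try {
        v.reserve(n);
    } catch (const std::length_error&) {
        return true;
    }
    return false;
}

// Hypothetical helper (not part of OpenCV): number of nodes to pre-reserve
// for a tree of the given maximum depth. The depth is clamped to depthCap so
// the shift cannot overflow and the request stays small.
std::size_t cappedNodeReserve(int maxDepth, std::size_t existingNodes, int depthCap = 16)
{
    assert(depthCap > 0 && depthCap < 31);
    const int depth = std::min(maxDepth, depthCap);
    const std::size_t fresh = depth > 0 ? (std::size_t(1) << depth) : 1024;
    return fresh + existingNodes;
}
```

With the cap in place, a maxDepth of INT_MAX only pre-reserves room for a depth-16 tree; the vectors can still grow on demand if training goes deeper.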
Tom Krause on 2015-07-10 11:28
When I set maxDepth, the program still crashes, again without an error message.
So it seems there is another bug.
be rak on 2015-07-10 13:20
Yes, true. If CVFolds is left at its default value of 10 (or any value != 0 in general), it crashes in DTreesImpl::calcValue, line 517:
<pre>
for( i = 0; i < n; i++ )
{
    int si = _sidx[i];
    j = w->cv_labels[si]; k = w->cat_responses[si];
    cv_cls_count[j*m + k] += w->sample_weights[si];
}
</pre>
w->cv_labels is never initialized, so it is empty when this loop indexes it.
The 2.4 branch assigned the cv_labels here: https://github.com/Itseez/opencv/blob/2.4/modules/ml/src/tree.cpp#L2749 . That part seems to be missing in 3.0.
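To illustrate why an empty cv_labels is fatal, here is a simplified stand-in for the loop above, assuming only the standard library. The function name and signature are hypothetical and this is not OpenCV code. It uses `.at()` instead of `operator[]`, so an empty label vector produces a catchable `std::out_of_range` instead of an out-of-bounds read.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Simplified stand-in for the accumulation loop in calcValue (not OpenCV code).
// m is the number of classes, cvFolds the number of cross-validation folds.
std::vector<double> countPerFoldAndClass(const std::vector<int>& sidx,
                                         const std::vector<int>& cvLabels,
                                         const std::vector<int>& catResponses,
                                         const std::vector<double>& sampleWeights,
                                         int m, int cvFolds)
{
    assert(m > 0 && cvFolds > 0);
    std::vector<double> counts(static_cast<std::size_t>(m) * cvFolds, 0.0);
    for (std::size_t i = 0; i < sidx.size(); i++)
    {
        const std::size_t si = static_cast<std::size_t>(sidx[i]);
        // .at() throws std::out_of_range when cvLabels is empty,
        // where operator[] would silently read out of bounds.
        const int j = cvLabels.at(si);
        const int k = catResponses.at(si);
        counts.at(static_cast<std::size_t>(j) * m + k) += sampleWeights.at(si);
    }
    return counts;
}
```

Calling it with an empty `cvLabels` throws on the first sample, which corresponds to the point where the unchecked loop crashes.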
I'm not sure if this addresses all use cases, but I have solved this in my own code by adding the following lines at the very bottom of DTreesImpl::startTraining( const Ptr<TrainData>& data, int ) in tree.cpp.
void DTreesImpl::startTraining( const Ptr<TrainData>& data, int )
{
    ...
    else
        data->getResponses().copyTo(w->ord_responses);
    // ---- new code ----
    const int n_cv = params.getCVFolds();
    if ( n_cv > 0 ) {
        int nsamples = (int) w->cat_responses.size();
        w->cv_labels.resize( nsamples );
        randu( w->cv_labels, 0, n_cv );
    }
}
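The idea behind the patch, giving every sample a random fold index in [0, n_cv), can be sketched outside OpenCV as follows. This is a stand-alone approximation, assuming a standard-library random generator in place of `randu`. The function name and the fixed seed are hypothetical choices for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-alone equivalent of the patch: one fold index per sample,
// drawn uniformly from [0, nFolds). Returns an empty vector when nFolds <= 0,
// mirroring the "if ( n_cv > 0 )" guard.
std::vector<int> assignFoldLabels(std::size_t nsamples, int nFolds, unsigned seed = 12345)
{
    std::vector<int> labels;
    if (nFolds <= 0)
        return labels;
    std::mt19937 rng(seed);
    // uniform_int_distribution uses inclusive bounds, hence nFolds - 1.
    std::uniform_int_distribution<int> pick(0, nFolds - 1);
    labels.resize(nsamples);
    for (int& label : labels)
        label = pick(rng);
    assert(labels.size() == nsamples);
    return labels;
}
```

Random assignment does not guarantee that every fold receives a sample. With only 10 samples and 10 folds, as in the example from the report, some folds will usually be empty.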