DTree classifier crashes without error message

Open opencv-pushbot opened this issue 10 years ago • 1 comments

Transferred from http://code.opencv.org/issues/4480

|| Tom Krause on 2015-07-10 07:24
|| Priority: Normal
|| Affected: branch 'master' (3.0-dev)
|| Category: ml
|| Tracker: Bug
|| Difficulty: Medium
|| PR: 
|| Platform: x64 / Linux

When I try to use DTrees in OpenCV 3.0, the "train" method crashes without giving an error message.

I tried many configurations, but here is a very simple example.

cv::Mat sampleMat = cv::Mat(10, 2, CV_32FC1);
cv::Mat labelMat = cv::Mat(10, 1, CV_32SC1);

float* sampleData = (float*)sampleMat.data;
int* labelData = (int*)labelMat.data;

(*sampleData++) = 1.f; (*sampleData++) = 1.f; (*labelData++) = 1;
(*sampleData++) = 1.f; (*sampleData++) = 3.f; (*labelData++) = 1;
(*sampleData++) = 2.f; (*sampleData++) = 8.f; (*labelData++) = 1;
(*sampleData++) = 1.f; (*sampleData++) = 23.f; (*labelData++) = 1;
(*sampleData++) = 2.f; (*sampleData++) = 546.f; (*labelData++) = 1;

(*sampleData++) = 4.f; (*sampleData++) = 1.f; (*labelData++) = 2;
(*sampleData++) = 3.f; (*sampleData++) = 3.f; (*labelData++) = 2;
(*sampleData++) = 4.f; (*sampleData++) = 8.f; (*labelData++) = 2;
(*sampleData++) = 4.f; (*sampleData++) = 23.f; (*labelData++) = 2;
(*sampleData++) = 5.f; (*sampleData++) = 546.f; (*labelData++) = 2;

cv::Ptr<cv::ml::TrainData> trainData = cv::ml::TrainData::create(sampleMat, cv::ml::ROW_SAMPLE, labelMat);

cv::Ptr<cv::ml::DTrees> dtree = cv::ml::DTrees::create();
dtree->train(trainData);

History

Tom Krause on 2015-07-10 07:29
I found a message in the terminal; maybe it helps a little:

"terminate called after throwing an instance of 'std::length_error'
  what():  vector::reserve"
be rak on 2015-07-10 11:05
I can reproduce it. If maxDepth is not set, the default value of INT_MAX is used, and addTree tries to reserve an absurd number of elements here:

<pre>

int DTreesImpl::addTree(const vector<int>& sidx )
{
    size_t n = (params.getMaxDepth() > 0 ? (1 << params.getMaxDepth()) : 1024) + w->wnodes.size();

    w->wnodes.reserve(n);
    w->wsplits.reserve(n);
    w->wsubsets.reserve(n*w->maxSubsetSize);
    ...

</pre>
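To make the failure mode concrete, here is a minimal standalone sketch that does not use OpenCV. `reserveThrowsLengthError` shows that asking `std::vector::reserve` for more than `max_size()` elements raises the `std::length_error` seen in the terminal. `initialNodeCapacity` is a hypothetical helper, not OpenCV code, that caps the shift so the capacity stays bounded; the cap of 20 is an arbitrary assumption for illustration, not a value taken from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of OpenCV): computes the initial node
// capacity like the quoted snippet, but never shifts by more than
// kMaxShift bits, so a maxDepth of INT_MAX cannot blow up the size.
std::size_t initialNodeCapacity(int maxDepth, std::size_t existingNodes)
{
    assert(maxDepth >= 0);      // a negative depth is treated as a caller error
    const int kMaxShift = 20;   // assumed cap: about one million nodes up front
    std::size_t fresh = 1024;   // same fallback as the quoted snippet
    if (maxDepth > 0)
        fresh = std::size_t(1) << (maxDepth < kMaxShift ? maxDepth : kMaxShift);
    return fresh + existingNodes;
}

// Returns true if reserving n ints throws std::length_error.
bool reserveThrowsLengthError(std::size_t n)
{
    std::vector<int> v;
    try {
        v.reserve(n);
    } catch (const std::length_error&) {
        return true;
    }
    return false;
}
```

Note that shifting a 32-bit `int` by INT_MAX bits, as the quoted expression does with the default depth, is undefined behavior in C++, so the exact garbage size that reaches `reserve` may differ between compilers and platforms.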
Tom Krause on 2015-07-10 11:28
When I set maxDepth, the program still crashes without an error message.

So it seems there is another bug.
be rak on 2015-07-10 13:20
Yes, true. If CVFolds is left at its default value of 10 (or, in general, any value != 0), it crashes in DTreesImpl::calcValue, line 517:

<pre>
            for( i = 0; i < n; i++ )
            {
                int si = _sidx[i];
                j = w->cv_labels[si]; k = w->cat_responses[si];
                cv_cls_count[j*m + k] += w->sample_weights[si];
            }
</pre>

w->cv_labels is never initialized, so it is empty when this loop indexes it.

The 2.4 branch assigned the cv_labels: https://github.com/Itseez/opencv/blob/2.4/modules/ml/src/tree.cpp#L2749 ; this part seems to be missing in 3.0.
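As a rough illustration of what the loop expects, the following standalone sketch mirrors it on plain `std::vector`s. All names here are hypothetical stand-ins for the `w->...` members, and the random fold assignment is only one plausible scheme, not necessarily what the 2.4 branch does. It uses `at()` so that an empty label vector raises `std::out_of_range` instead of silently reading out of bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the missing initialization:
// one fold label in [0, cvFolds) per training sample.
std::vector<int> assignFoldLabels(std::size_t nsamples, int cvFolds, unsigned seed)
{
    assert(cvFolds > 0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> fold(0, cvFolds - 1);
    std::vector<int> labels(nsamples);
    for (int& label : labels)
        label = fold(rng);
    return labels;
}

// Mirrors the quoted accumulation loop: for every sample index, add its
// weight to the (fold, class) cell. at() turns an empty or too-short
// cvLabels into a catchable exception rather than an invalid read.
std::vector<double> countPerFoldAndClass(const std::vector<int>& sidx,
                                         const std::vector<int>& cvLabels,
                                         const std::vector<int>& catResponses,
                                         const std::vector<double>& sampleWeights,
                                         int cvFolds, int nclasses)
{
    std::vector<double> counts(std::size_t(cvFolds) * nclasses, 0.0);
    for (int si : sidx) {
        int j = cvLabels.at(si);      // fold of this sample
        int k = catResponses.at(si);  // class of this sample
        counts.at(std::size_t(j) * nclasses + k) += sampleWeights.at(si);
    }
    return counts;
}
```

With labels produced by `assignFoldLabels`, every index into the count table is in range; with an empty label vector, the first iteration fails immediately, which corresponds to the crash described above.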

opencv-pushbot, Jul 27 '15 11:07

I'm not sure if this addresses all use cases, but I have solved this in my own code by adding the following lines at the very bottom of DTreesImpl::startTraining( const Ptr<TrainData>& data, int ) in tree.cpp.

void DTreesImpl::startTraining( const Ptr<TrainData>& data, int )
{

...

    else
        data->getResponses().copyTo(w->ord_responses);

    // ---- new code ----
    const int n_cv = params.getCVFolds();
    if ( n_cv > 0 ) {
        int nsamples = (int) w->cat_responses.size();
        w->cv_labels.resize( nsamples );
        randu( w->cv_labels, 0, n_cv );
    }
}

afriesen, Sep 28 '15 18:09