std::bad_alloc when used in std::thread

Open romain-martin opened this issue 2 years ago • 0 comments

First of all, thanks for your work. I integrated your C++ version into my project and it works very well. I build Eigen and my project on Linux with C++17 and no unusual compiler flags, and everything works. To handle several video streams more efficiently, I tried to use std::thread, but I ran into a problem:

Even if I create only one std::thread, I get a std::bad_alloc error. The error seems to occur in KalmanFilter.cpp, in the unfreeze() method at line 118: double x1 = box1[0]; After some time (about 40 processed frames), new_history.at(secondLastNotNullIndex)[0] appears to be empty. I can't figure out why; it only happens when using std::thread.

Here is how I launch the work:

{  
    size_t imageBatchLength = imageBatch.size();
    std::vector<std::future<std::vector<int>>> future_crop;
    // process each image in the batch; the index is size_t to match imageBatchLength
    for (size_t imgIdx = 0; imgIdx < imageBatchLength; ++imgIdx) {
        auto image = imageBatch.operator[]<Image<cv::cuda::GpuMat>>(imgIdx);
        if (_activatedCameras.find(image.getStreamId()) == _activatedCameras.end() ||
            !_activatedCameras.at(image.getStreamId()))
        {
            future_crop.emplace_back(); // default-constructed (invalid) future marks a skipped camera
        }
        else
        {          
            auto future = std::async(std::launch::async, &TrackingNode::processDetections, this,
                                 image.getMatrix(), detectionsBatch.operator[]<YoloDetections>(imgIdx), image.getStreamId(), false);
            future_crop.emplace_back(std::move(future));
        }       
    }
    for (size_t imgIdx = 0; imgIdx < imageBatchLength; ++imgIdx) {
        if(!future_crop[imgIdx].valid())
        {
            db.emplaceBack(std::vector<int>());
            continue;
        }
        std::vector<int> trackerBoxes = future_crop[imgIdx].get();
        if(trackerBoxes.empty())
        {
            db.emplaceBack(std::vector<int>());
        }
        else
        {
            db.emplaceBack(trackerBoxes); 
        }
    }
    return db;
}

And the function I call asynchronously is defined here:

std::vector<int> TrackingNode::processDetections(const cv::cuda::GpuMat& image, const YoloDetections& detections,
                                                 const std::string& camId, bool isSubFrame) {
 // Return early for an unknown camera. find() is used here because operator[] would insert an empty entry on a miss.
 std::vector<int> results(detections.size(), -1); // -1 = no tracking id for that detection
 const auto filterIt = _labelFilters.find(camId);
 if (filterIt == _labelFilters.end())
     return results;
 const auto& labels = filterIt->second;
 std::vector<int> indexes; // original detection indexes, used to reassign tracking ids to the right detection
 std::vector<DetectedObject> fullObjectList;
 YoloDetections toTrackObjects;
 for (int i = 0; i < detections.size(); ++i) {
     fullObjectList.push_back(detections[i]);
     if (std::find(labels.begin(), labels.end(), fullObjectList[i].getObjectName()) != labels.end()) {
         toTrackObjects.emplaceDetectedObject(fullObjectList[i]);
         indexes.push_back(i);
     }
 }
 auto res = _trackers[camId].update(ocsort::Vector2Matrix(toTrackObjects, image, _classes[camId]));
 // assign ids to the right object index
 for (int j = 0; j < res.size(); ++j) {
     results[indexes[j]] = res[j][4];
 }
 return results;
}
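
The final loop in processDetections writes results[indexes[j]] for every row the tracker returns, which is only safe if the tracker never returns more rows than there are filtered detections. Whether that holds for OCSort::update is not established here. The following is a minimal sketch of a guarded version of that mapping, assuming column 4 of each row holds the track id (as the code above treats it). The function name assignTrackIds and the std::array row type are hypothetical stand-ins for illustration, not part of the project or of OC-SORT.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one tracker output row: x1, y1, x2, y2, track id.
using TrackRow = std::array<float, 5>;

// Maps track ids back onto the original detection slots.
// Slots without a track keep -1. Rows beyond indexes.size() and
// out-of-range targets are skipped instead of being written out of bounds.
std::vector<int> assignTrackIds(std::size_t detectionCount,
                                const std::vector<int>& indexes,
                                const std::vector<TrackRow>& tracks) {
    std::vector<int> results(detectionCount, -1);
    const std::size_t n = std::min(indexes.size(), tracks.size());
    for (std::size_t j = 0; j < n; ++j) {
        const int target = indexes[j];
        if (target < 0 || static_cast<std::size_t>(target) >= results.size())
            continue;
        results[static_cast<std::size_t>(target)] = static_cast<int>(tracks[j][4]);
    }
    return results;
}
```

This only prevents out-of-range writes. It does not check that row j of the tracker output really corresponds to filtered detection j, which is a separate assumption of the original loop.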

Where _trackers is a std::map<std::string, ocsort::OCSort>.

So I have a different OCSort object for each thread I launch; however, that doesn't seem to be the issue here, as I only launch one thread.
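
One detail that matters once several threads share the same maps: std::map::operator[] is a mutating call, because it inserts a default-constructed value when the key is missing, whereas find() leaves the map untouched. The sketch below illustrates the difference on a hypothetical stand-in for the per-camera filter map. The alias LabelFilters and both helper names are assumptions made for this example only.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-camera map such as _labelFilters.
using LabelFilters = std::map<std::string, std::vector<std::string>>;

// operator[] inserts an empty entry when camId is absent, so the map grows.
// The map is taken by value here so the caller's copy is not modified.
std::size_t sizeAfterBracketLookup(LabelFilters filters, const std::string& camId) {
    (void)filters[camId];
    return filters.size();
}

// find() never modifies the map; a missing key is reported as nullptr.
const std::vector<std::string>* findFilters(const LabelFilters& filters,
                                            const std::string& camId) {
    const auto it = filters.find(camId);
    return it == filters.end() ? nullptr : &it->second;
}
```

With a single worker thread this is unlikely to explain the crash by itself, but lookups through find() or at() keep the shared maps read-only while workers are running.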

What is weird, though, is that the same code runs fine when I remove the std::thread part and call processDetections synchronously.
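
To compare the two modes without restructuring the calling code, the launch policy can be made a parameter: std::launch::async runs each task on its own thread, while std::launch::deferred runs it lazily on the thread that calls get(). This is a minimal sketch under that idea. processOne is a hypothetical placeholder for TrackingNode::processDetections, and runBatch is an illustrative name, not part of the project.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical placeholder for the per-stream work (processDetections in the post).
std::vector<int> processOne(int streamIndex) {
    return {streamIndex, streamIndex + 1};
}

// Launches one task per stream with the given policy and collects the results in order.
// With std::launch::deferred, each task runs inside get() on the calling thread.
std::vector<std::vector<int>> runBatch(int streamCount, std::launch policy) {
    std::vector<std::future<std::vector<int>>> futures;
    futures.reserve(static_cast<std::size_t>(streamCount));
    for (int i = 0; i < streamCount; ++i) {
        futures.push_back(std::async(policy, processOne, i));
    }
    std::vector<std::vector<int>> results;
    results.reserve(futures.size());
    for (auto& future : futures) {
        results.push_back(future.get());
    }
    return results;
}
```

Calling runBatch(n, std::launch::deferred) and runBatch(n, std::launch::async) on the same input keeps everything identical except the executing thread, which helps narrow down whether the failure depends on threading itself or on something else that changed along with it.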

Has anyone faced this issue before, or does anyone have a clue why this happens? Thank you.

Aug 09 '23 15:08 romain-martin