ncnn-android-yolov8
generate_proposals seg faults on line 155 with a single class model
When running with a single-class model, generate_proposals seg faults on line 155, at i = 1958. The dimensions of pred are w = 8,400 and h = 5, for a total of 42,000 float entries. It seems to me that the stride between the rows returned by ncnn::Mat::row() is not 5 as the code appears to expect, given that grid_strides also has 8,400 entries.
It also seems to me that the layout of the output changes with the number of classes, so ncnn::Mat::row() should not be the only means of traversing it. It is much safer to traverse the output as an array, taking its dimensions into account.
The only other explanation I can think of for this behavior is that the array is not contiguous in memory. Either way, the segmentation fault has been consistent for me.
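To make the dimension argument concrete, here is a minimal sketch of bounds-aware indexing. It does not use ncnn at all: FlatPred, inBounds and offset are hypothetical names, and the buffer is assumed to hold h rows of w floats stored one row after another.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the 2D prediction matrix (not an ncnn type).
// Assumption: h rows of w floats each, stored row after row.
struct FlatPred {
    int w;                    // candidate boxes per row (8,400 in this report)
    int h;                    // 4 box values + number of classes (5 here)
    std::vector<float> data;  // w * h floats in total

    FlatPred(int width, int height)
        : w(width), h(height),
          data(static_cast<std::size_t>(width) * height, 0.0f) {}

    // True only when (row, col) lies inside the buffer.
    bool inBounds(int row, int col) const {
        return row >= 0 && row < h && col >= 0 && col < w;
    }

    // Flat offset of (row, col); asserts that the index is valid.
    std::size_t offset(int row, int col) const {
        assert(inBounds(row, col));
        return static_cast<std::size_t>(row) * w + col;
    }
};
```

With w = 8,400 and h = 5 the buffer holds 42,000 floats and the only valid rows are 0 through 4, so a loop that asks for row 1958 is already far outside the data.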
I was able to parse the output of the Yolov8 model just fine without much of what you were doing. First of all, the model already does a softmax on the bounding box output, and also converts them to the proper coordinates by multiplying them by the anchors. You can see that it does this in the head with Netron. It also does a sigmoid on the box scores before rejoining them with the bounding boxes in the head. None of this is necessary for you to do.
Now as for ncnn::Mat::row(): row(0) is your box X center values, row(1) is your Y center values, row(2) is your width values, row(3) is your height values, and row(4) onward are your box scores, one row per class in class order. Your program seg faults where it does because it walks off the end of the array, and eventually out of valid memory, as it goes through far more rows than actually exist. There aren't grid_strides.size() rows, only 4 + numClass rows.
Lucky for you, the solution is simply to set your 5 pointers right at the start and to calculate the number of classes as pred.h - 4. After removing the problematic parts, your generate_proposals should look something like this:
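#include <cfloat>   // FLT_MAX
#include <vector>

// Object and modelConfidenceThreshold are defined elsewhere in the project.
void generate_proposals(const ncnn::Mat &pred, std::vector<Object> &objects)
{
    const int rowWidth = pred.w;   // number of candidate boxes
    const int rowHeight = pred.h;  // 4 box values + number of classes

    // Rows 0-3 hold the box geometry; rows 4+ hold one score row per class.
    const float *rowX = pred.row(0);
    const float *rowY = pred.row(1);
    const float *rowW = pred.row(2);
    const float *rowH = pred.row(3);
    const float *conf = pred.row(4);
    const int numClass = rowHeight - 4;

    for (int i = 0; i < rowWidth; ++i)
    {
        // Find the best-scoring class for box i.
        int label = 0;
        float boxProb = -FLT_MAX;
        for (int k = 0; k < numClass; ++k)
        {
            // Score rows are rowWidth floats apart.
            float confidence = conf[i + rowWidth * k];
            if (confidence > boxProb)
            {
                label = k;
                boxProb = confidence;
            }
        }

        if (boxProb >= modelConfidenceThreshold)
        {
            // Convert center/size to top-left corner/size.
            Object det;
            det.rect.x = rowX[i] - 0.5f * rowW[i];
            det.rect.y = rowY[i] - 0.5f * rowH[i];
            det.rect.width = rowW[i];
            det.rect.height = rowH[i];
            det.confidence = boxProb;
            // label holds the winning class index; store it on det as well
            // if your Object has a field for it.
            objects.push_back(det);
        }
    }
}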
See? Nice and simple. I hope you can get your code functional before you add it as an actual example to the Yolov8 repository.
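For anyone who wants to check the indexing without building against ncnn, below is a rough standalone sketch of the same traversal over a plain std::vector<float>. Detection and decodePredictions are hypothetical names, not part of the repository, and the buffer is assumed to hold 4 + numClass rows of numBoxes floats stored row after row. Unlike the function above, it also records the winning class index and refuses buffers whose size does not match the stated dimensions.

```cpp
#include <cfloat>
#include <cstddef>
#include <vector>

// Hypothetical detection record; the repository's Object type may differ.
struct Detection {
    float x, y, width, height;  // top-left corner plus size
    float confidence;           // best class score
    int label;                  // index of the best class
};

// Decodes a buffer of numRows rows by numBoxes columns (assumed row after row).
// Rows 0-3: center x, center y, width, height. Rows 4+: one score row per class.
std::vector<Detection> decodePredictions(const std::vector<float>& pred,
                                         int numBoxes, int numRows,
                                         float threshold) {
    std::vector<Detection> out;
    const int numClass = numRows - 4;
    // Reject buffers that do not match the stated dimensions.
    if (numBoxes <= 0 || numClass <= 0 ||
        pred.size() != static_cast<std::size_t>(numBoxes) * numRows) {
        return out;
    }

    const float* rowX = pred.data();
    const float* rowY = rowX + numBoxes;
    const float* rowW = rowY + numBoxes;
    const float* rowH = rowW + numBoxes;
    const float* conf = rowH + numBoxes;

    for (int i = 0; i < numBoxes; ++i) {
        int label = 0;
        float best = -FLT_MAX;
        for (int k = 0; k < numClass; ++k) {
            const float score = conf[i + static_cast<std::size_t>(numBoxes) * k];
            if (score > best) {
                best = score;
                label = k;
            }
        }
        if (best >= threshold) {
            out.push_back({rowX[i] - 0.5f * rowW[i], rowY[i] - 0.5f * rowH[i],
                           rowW[i], rowH[i], best, label});
        }
    }
    return out;
}
```

Feeding it a tiny hand-written buffer (say 3 boxes and 2 classes) is a quick way to confirm the row order before wiring the logic back into the ncnn version.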
@ChthonicOne, do you have a cleaned-up version of this modification in a branch or PR that works? I tried the modifications described above and am still getting segmentation faults.
Thanks a lot for your improvement. I'm not familiar with C++, and I was stuck here for a long time. I didn't expect .row() to be the problem. Thanks for your nice and neat code.