simple-faster-rcnn-pytorch: A question about the RoI pooling process

Open Yuesheng321 opened this issue 5 years ago • 1 comments

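The images above show what I saved during RoI pooling: channel 0 of the feature map, the coordinates of all RoIs, and the pooled result for RoI 0.

Question 1: for RoI 0, the coordinates scaled down by a factor of 16 are in columns 7, 8, 9, and 10. Should these positions be rounded to the nearest integer or truncated?

Question 2: assuming rounding, I get start and end (x, y) coordinates of [21, 20] -> [27, 31] (note that Excel indices start at 1). Max pooling is then applied to this 12 * 7 region, which has to be pooled down to 7 * 7. How are the 12 rows divided? For example, which rows of the feature map does the first pooled row correspond to? The RoI pooling is written in CuPy and I find it hard to follow. Is there a formula?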

Dec 04 '19 14:12 Yuesheng321

Have a look at the roi_forward kernel: for each batch it launches N * C * 7 * 7 threads in parallel, i.e. every channel of every b_box gets 49 threads.
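As a rough illustration of how N * C * 7 * 7 threads can each own one output cell, the sketch below decodes a flat thread index into (n, c, ph, pw). The index layout (pw varies fastest, then ph, c, n) is an assumption based on the usual Caffe-style RoI pooling kernel, and `OutputCell` and `decode_index` are hypothetical names, not identifiers from the repository.

```cpp
#include <cassert>

// Hypothetical helper: the output cell handled by one thread.
struct OutputCell {
    int n;   // RoI index
    int c;   // channel index
    int ph;  // bin row inside the pooled output
    int pw;  // bin column inside the pooled output
};

// Decode a flat index, assuming pw varies fastest, then ph, then c, then n.
OutputCell decode_index(int index, int channels, int pooled_height, int pooled_width) {
    assert(index >= 0 && channels > 0 && pooled_height > 0 && pooled_width > 0);
    OutputCell cell;
    cell.pw = index % pooled_width;
    cell.ph = (index / pooled_width) % pooled_height;
    cell.c = (index / pooled_width / pooled_height) % channels;
    cell.n = index / pooled_width / pooled_height / channels;
    return cell;
}
```

With 7 * 7 bins, indices 0 to 48 all decode to channel 0 of RoI 0, which is where the 49 threads per channel come from.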

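First, the four edges of the b_box are each multiplied by the scaling factor (spatial_scale), and the results are rounded to the nearest integer (round) to map them onto the feature map: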

const int roi_start_w = round(bottom_rois[num * 5 + 1] * spatial_scale);  //x_min
const int roi_start_h = round(bottom_rois[num * 5 + 2] * spatial_scale);  //y_min
const int roi_end_w = round(bottom_rois[num * 5 + 3] * spatial_scale);  //x_max
const int roi_end_h = round(bottom_rois[num * 5 + 4] * spatial_scale);  //y_max
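To make the rounding step concrete, here is a small host-side sketch of the same arithmetic. It assumes each RoI is stored as five floats `[batch_index, x_min, y_min, x_max, y_max]`, which is what the `num * 5 + k` indexing suggests, and it uses `spatial_scale = 1/16` as in the question. `ScaledRoi` and `scale_roi` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the RoI corners on the feature map.
struct ScaledRoi {
    int start_w;  // x_min
    int start_h;  // y_min
    int end_w;    // x_max
    int end_h;    // y_max
};

// Scale RoI number `num` and round each edge to the nearest integer.
// std::round rounds halfway cases away from zero (2.5 -> 3).
ScaledRoi scale_roi(const float* bottom_rois, int num, float spatial_scale) {
    assert(bottom_rois != nullptr && num >= 0);
    const float* roi = bottom_rois + num * 5;  // roi[0] is the batch index
    ScaledRoi r;
    r.start_w = static_cast<int>(std::round(roi[1] * spatial_scale));
    r.start_h = static_cast<int>(std::round(roi[2] * spatial_scale));
    r.end_w = static_cast<int>(std::round(roi[3] * spatial_scale));
    r.end_h = static_cast<int>(std::round(roi[4] * spatial_scale));
    return r;
}
```

For a RoI of (100, 40, 200, 150) and a scale of 1/16 this gives (6, 3, 13, 9): 40 / 16 = 2.5 rounds to 3 and 200 / 16 = 12.5 rounds to 13, whereas truncation would give 2 and 12.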

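Next, the RoI on the feature map is divided evenly into 7 parts along both the width and the height, which gives the width and height of each bin: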

const float bin_size_h = static_cast<float>(roi_height) / static_cast<float>(pooled_height);
const float bin_size_w = static_cast<float>(roi_width) / static_cast<float>(pooled_width);
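The snippet above uses `roi_height` and `roi_width` without showing where they come from. A minimal sketch, assuming the Caffe-style convention that the end coordinate is inclusive (hence the `+ 1`) and that a degenerate RoI is forced to be at least one cell wide; `roi_extent` and `bin_size` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>

// Number of feature-map cells covered by the RoI along one axis.
// Assumption: the end coordinate is inclusive and the extent is at least 1.
int roi_extent(int roi_start, int roi_end) {
    return std::max(roi_end - roi_start + 1, 1);
}

// Size of one bin along one axis; usually not an integer.
float bin_size(int roi_extent_cells, int pooled_size) {
    assert(roi_extent_cells > 0 && pooled_size > 0);
    return static_cast<float>(roi_extent_cells) / static_cast<float>(pooled_size);
}
```

For the region in the question (12 rows by 7 columns pooled to 7 * 7), this gives a bin height of 12 / 7, roughly 1.714, and a bin width of exactly 1.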

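ph and pw give the position of each bin within the 7 * 7 grid. The start of each bin along the height and width is rounded down (floor) and the end is rounded up (ceil), which gives coordinates relative to the b_box: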

int hstart = static_cast<int>(floor(static_cast<float>(ph) * bin_size_h));
int wstart = static_cast<int>(floor(static_cast<float>(pw) * bin_size_w));
int hend = static_cast<int>(ceil(static_cast<float>(ph + 1) * bin_size_h));
int wend = static_cast<int>(ceil(static_cast<float>(pw + 1) * bin_size_w));
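To see which rows each bin covers, the floor/ceil rule can be evaluated for every bin along one axis. This is only a sketch of the arithmetic shown above; `BinRange` and `relative_bins` are hypothetical names, and the ranges are half-open, relative to the RoI start.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Half-open range [start, end) of cells, relative to the RoI start.
struct BinRange {
    int start;
    int end;
};

// Apply the floor/ceil rule to every bin along one axis.
std::vector<BinRange> relative_bins(int roi_extent_cells, int pooled_size) {
    assert(roi_extent_cells > 0 && pooled_size > 0);
    const float bin = static_cast<float>(roi_extent_cells) /
                      static_cast<float>(pooled_size);
    std::vector<BinRange> bins;
    for (int p = 0; p < pooled_size; ++p) {
        BinRange b;
        b.start = static_cast<int>(std::floor(static_cast<float>(p) * bin));
        b.end = static_cast<int>(std::ceil(static_cast<float>(p + 1) * bin));
        bins.push_back(b);
    }
    return bins;
}
```

For the 12 rows in the question, `relative_bins(12, 7)` yields [0,2), [1,4), [3,6), [5,7), [6,9), [8,11), [10,12). The first pooled row therefore covers RoI rows 0 and 1, and neighbouring bins overlap by one row because the start is floored and the end is ceiled.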

These relative coordinates are then added to the absolute position of the b_box's starting corner on the feature map (and clipped to the feature-map bounds), which gives each bin's position on the feature map:

hstart = min(max(hstart + roi_start_h, 0), height);
hend = min(max(hend + roi_start_h, 0), height);
wstart = min(max(wstart + roi_start_w, 0), width);
wend = min(max(wend + roi_start_w, 0), width);
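The shift from RoI-relative to feature-map coordinates can be sketched as a one-line helper. Clamping the result to `[0, limit]` is an assumption taken from Caffe-style kernels, and `to_absolute` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>

// Shift a RoI-relative bound by the RoI start and clamp it to [0, limit],
// where limit is the feature-map height (for rows) or width (for columns).
int to_absolute(int relative, int roi_start, int limit) {
    assert(limit >= 0);
    return std::min(std::max(relative + roi_start, 0), limit);
}
```

In the question the RoI starts at Excel row 20, i.e. row 19 when counting from 0. A relative bin of [0, 2) then becomes rows [19, 21) on the feature map, which are Excel rows 20 and 21.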

Finally, two nested for loops find the maximum feature value (maxval) inside each bin:

for (int h = hstart; h < hend; ++h)
{
    for (int w = wstart; w < wend; ++w)
    {
        int bottom_index = h * width + w;
        if (bottom_data[data_offset + bottom_index] > maxval)
        {
            maxval = bottom_data[data_offset + bottom_index];
            maxidx = bottom_index;
        }
    }
}
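For checking saved feature maps by hand, the same loop can be run on the CPU for a single bin. In this sketch `channel` already holds one channel of one image in row-major order, so it stands in for `bottom_data + data_offset`. Returning 0 and index -1 for an empty bin is an assumption borrowed from Caffe-style kernels, and `PoolResult` and `pool_bin` are hypothetical names.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Maximum value of a bin and its flat index within the channel.
struct PoolResult {
    float maxval;
    int maxidx;
};

// Max over rows [hstart, hend) and columns [wstart, wend) of one channel.
PoolResult pool_bin(const std::vector<float>& channel, int width,
                    int hstart, int hend, int wstart, int wend) {
    assert(width > 0);
    const bool is_empty = (hend <= hstart) || (wend <= wstart);
    PoolResult r;
    // Assumption: an empty bin yields 0; otherwise start below any float value.
    r.maxval = is_empty ? 0.0f : -std::numeric_limits<float>::max();
    r.maxidx = -1;
    for (int h = hstart; h < hend; ++h) {
        for (int w = wstart; w < wend; ++w) {
            const int bottom_index = h * width + w;
            if (channel[bottom_index] > r.maxval) {
                r.maxval = channel[bottom_index];
                r.maxidx = bottom_index;
            }
        }
    }
    return r;
}
```

Calling `pool_bin` once per (ph, pw) pair with the absolute bin bounds reproduces sequentially what the kernel computes with one thread per bin.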

Each bin is handled by one thread running the code above; executed in parallel, this produces the roi_pool result for a whole batch.

Feb 23 '20 05:02 bwcxa
