pytorch-retinanet

Issue with encoder.py

Open CluelessIT opened this issue 4 years ago • 1 comment

I would like to clarify whether the input to the _get_anchor_boxes function in encoder.py should be a tuple or an int.

I noticed that once execution reaches line 50, it is not able to assign values to fm_w and fm_h.

def _get_anchor_boxes(self, input_size):
        '''Compute anchor boxes for each feature map.
        Args:
          input_size: (tensor) model input size of (w,h).
        Returns:
          boxes: (list) anchor boxes for each feature map. Each of size [#anchors,4],
                        where #anchors = fmw * fmh * #anchors_per_cell
        '''
        num_fms = len(self.anchor_areas)
        fm_sizes = [(input_size/pow(2.,i+3)).ceil() for i in range(num_fms)]  # p3 -> p7 feature map sizes

        boxes = []
        for i in range(num_fms):
            fm_size = fm_sizes[i]
            grid_size = input_size / fm_size
            fm_w, fm_h = int(fm_size[0]), int(fm_size[1])
            xy = meshgrid(fm_w,fm_h) + 0.5  # [fm_h*fm_w, 2]
            xy = (xy*grid_size).view(fm_h,fm_w,1,2).expand(fm_h,fm_w,9,2)
            wh = self.anchor_wh[i].view(1,1,9,2).expand(fm_h,fm_w,9,2)
            box = torch.cat([xy,wh], 3)  # [x,y,w,h]
            boxes.append(box.view(-1,4))
        return torch.cat(boxes, 0)

I pass input_size as an integer, for example 448, and the output of fm_size is a list of single elements, not a list of tuples. So I am confused about what values fm_size should hold and, more generally, about the purpose of encoder.py.

If anybody could explain its purpose to me, that would be great! Thank you so much!
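For reference, the docstring above describes input_size as a tensor holding (w, h), so one reading is that a two-element value (for example torch.Tensor([448, 448]), which is an assumption and not confirmed in this thread) is expected rather than a bare integer; that would make fm_size[0] and fm_size[1] indexable. The sketch below mirrors the arithmetic of the function in plain Python, without torch, to show what fm_sizes and the anchor centers would look like under that reading. The helper names feature_map_sizes and anchor_centers are hypothetical, and the five levels (p3 to p7) and the x-fastest ordering of centers are assumptions taken from the comments in the snippet.

```python
import math

NUM_FMS = 5           # assumption: five feature maps (p3..p7), per the comment in the snippet
ANCHORS_PER_CELL = 9  # the hard-coded 9 used in the view/expand calls


def feature_map_sizes(input_size, num_fms=NUM_FMS):
    """Mirror of the fm_sizes line: ceil(input_size / 2**(i+3)) for each axis."""
    w, h = input_size
    return [(math.ceil(w / 2 ** (i + 3)), math.ceil(h / 2 ** (i + 3)))
            for i in range(num_fms)]


def anchor_centers(input_size, fm_size):
    """Centers of the grid cells in input-image coordinates.

    Mirrors (meshgrid(fm_w, fm_h) + 0.5) * grid_size, assuming x varies fastest.
    """
    w, h = input_size
    fm_w, fm_h = fm_size
    grid_w, grid_h = w / fm_w, h / fm_h  # size of one cell in input pixels
    return [((x + 0.5) * grid_w, (y + 0.5) * grid_h)
            for y in range(fm_h) for x in range(fm_w)]


sizes = feature_map_sizes((448, 448))
total_anchors = sum(fm_w * fm_h * ANCHORS_PER_CELL for fm_w, fm_h in sizes)
centers_p7 = anchor_centers((448, 448), sizes[-1])

print(sizes)          # one (fm_w, fm_h) pair per feature map
print(total_anchors)  # total number of anchor boxes over all feature maps
print(centers_p7[0])  # center of the first cell on the coarsest map
```

Under these assumptions, each entry of fm_sizes is a (width, height) pair, which is why a single integer input leaves nothing to index with [0] and [1].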

CluelessIT avatar Feb 18 '21 09:02 CluelessIT

@kuangliu
