Keras-RFCN

Training on a simple (shapes) dataset

Open gaborvecsei opened this issue 5 years ago • 4 comments

Hi,

I tried to train on the included dataset.

(The code is from the MaskRCNN repo; I only added the load_bbox function.)

The problem is that during training my losses decrease steadily (on both the training and validation sets), but when I test the model, the output is disappointing. On a dataset this simple it should overfit pretty fast.

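import math
import random

import cv2
import numpy as np

# Dataset and Utils come from the Keras-RFCN package (their import is not shown in the original post).


class ShapesDataset(Dataset):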
    def load_shapes(self, count, height, width):
        self.add_class("shapes", 1, "square")
        self.add_class("shapes", 2, "circle")
        self.add_class("shapes", 3, "triangle")

        for i in range(count):
            bg_color, shapes = self.random_image(height, width)
            self.add_image("shapes", image_id=i, path=None,
                           width=width, height=height,
                           bg_color=bg_color, shapes=shapes)

    def load_image(self, image_id):
        info = self.image_info[image_id]
        bg_color = np.array(info['bg_color']).reshape([1, 1, 3])
        image = np.ones([info['height'], info['width'], 3], dtype=np.uint8)
        image = image * bg_color.astype(np.uint8)
        for shape, color, dims in info['shapes']:
            image = self.draw_shape(image, shape, dims, color)
        return image

    def image_reference(self, image_id):
        info = self.image_info[image_id]
        if info["source"] == "shapes":
            return info["shapes"]
        else:
            # Fall back to the parent implementation and return its result.
            return super(ShapesDataset, self).image_reference(image_id)

    def get_keys(self, d, value):
        return [k for k, v in d.items() if v == value]

    def load_bbox(self, image_id):
        info = self.image_info[image_id]
        shapes = info['shapes']
        count = len(shapes)
        mask = np.zeros([info['height'], info['width'], count], dtype=np.uint8)
        for i, (shape, _, dims) in enumerate(info['shapes']):
            mask[:, :, i:i + 1] = self.draw_shape(mask[:, :, i:i + 1].copy(),
                                                  shape, dims, 1)

        bboxes = []
        for i in range(mask.shape[2]):
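            # [-2] selects the contour list in both OpenCV 3 (three return values) and OpenCV 4 (two return values).
            cnts = cv2.findContours(mask[:, :, i] * 255, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]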
            cnt = sorted(cnts, key=cv2.contourArea, reverse=True)[0]

            x, y, w, h = cv2.boundingRect(cnt)
            # The right format is:
            # (y1, x1, y2, x2)
            bboxes.append([y, x, y + h, x + w])

        class_ids = np.array([self.class_names.index(s[0]) for s in shapes])

        if len(class_ids) != len(bboxes):
            raise ValueError("Class ids are not equal with num of bboxes")

        return np.array(bboxes), np.array(class_ids)

    def draw_shape(self, image, shape, dims, color):
        x, y, s = dims
        if shape == 'square':
            image = cv2.rectangle(image, (x - s, y - s),
                                  (x + s, y + s), color, -1)
        elif shape == "circle":
            image = cv2.circle(image, (x, y), s, color, -1)
        elif shape == "triangle":
            points = np.array([[(x, y - s),
                                (x - s / math.sin(math.radians(60)), y + s),
                                (x + s / math.sin(math.radians(60)), y + s),
                                ]], dtype=np.int32)
            image = cv2.fillPoly(image, points, color)
        return image

    def random_shape(self, height, width):
        shape = random.choice(["square", "circle", "triangle"])
        color = tuple([random.randint(0, 255) for _ in range(3)])
        buffer = 20
        y = random.randint(buffer, height - buffer - 1)
        x = random.randint(buffer, width - buffer - 1)
        s = random.randint(buffer, height // 4)
        return shape, color, (x, y, s)

    def random_image(self, height, width):
        # bg_color = np.array([random.randint(0, 255) for _ in range(3)])
        bg_color = np.array([0, 0, 0], dtype=np.uint8)
        shapes = []
        boxes = []
        N = random.randint(1, 2)
        for _ in range(N):
            shape, color, dims = self.random_shape(height, width)
            shapes.append((shape, color, dims))
            x, y, s = dims
            boxes.append([y - s, x - s, y + s, x + s])
        keep_ixs = Utils.non_max_suppression(
            np.array(boxes), np.arange(N), 0.3)
        shapes = [s for i, s in enumerate(shapes) if i in keep_ixs]
        return bg_color, shapes
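
As a cross-check for load_bbox, the box of a single-shape mask can also be computed with NumPy alone. This is only a minimal sketch, not part of the original post: bbox_from_mask is a hypothetical helper name, and it assumes the same (y1, x1, y2, x2) convention with exclusive y2/x2 that load_bbox produces from cv2.boundingRect.

```python
import numpy as np


def bbox_from_mask(mask):
    """Return (y1, x1, y2, x2) for a 2-D binary mask, or zeros if the mask is empty.

    Hypothetical helper for illustration; y2 and x2 are exclusive,
    matching y + h and x + w from cv2.boundingRect.
    """
    rows = np.where(np.any(mask, axis=1))[0]  # row indices that contain the shape
    cols = np.where(np.any(mask, axis=0))[0]  # column indices that contain the shape
    if rows.size == 0:
        return np.zeros(4, dtype=np.int32)
    y1, y2 = rows[0], rows[-1] + 1
    x1, x2 = cols[0], cols[-1] + 1
    return np.array([y1, x1, y2, x2], dtype=np.int32)


# A 20 x 30 rectangle drawn into an otherwise empty 64 x 64 mask.
mask = np.zeros((64, 64), dtype=np.uint8)
mask[10:30, 20:50] = 1
box = bbox_from_mask(mask)
print(box)
```

Comparing this result with the output of load_bbox on a few generated images is a quick way to rule out a coordinate-order mistake in the ground truth.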

[image]

All the boxes shown here had a confidence of 0.9 or above.

Do you have any idea what causes this? Could you try it out, so that we would have a simple "tutorial", unlike the one with the fashion dataset?

gaborvecsei, Sep 03 '18 13:09

Hi @gaborvecsei, I don't see any obvious errors in your implementation, so I will look into it in the next few days. I'm busy with other projects at the moment, so I can't give you any useful advice right now. I'll leave this issue open until then. Sorry.

parap1uie-s, Sep 11 '18 09:09

Hi @parap1uie-s, thank you for your response! I am looking forward to hearing your thoughts on this issue. I hope we can find a solution. Btw, thank you very much for your effort on this implementation!

gaborvecsei, Sep 11 '18 10:09

I had a little time and I tried to find out what went wrong:

  • Where I generate the boxes, I need to clip the values to between 0 and the image's maximum dimensions.
  • I also need to cast the boxes with bboxes.astype(np.int32).
  • The model should be loaded with by_name=True.
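
The first two points can be sketched as follows. This is a minimal illustration under assumptions, not the exact change: clip_boxes is a hypothetical helper name, boxes are assumed to be in (y1, x1, y2, x2) order, and the upper bounds are taken to be the image height and width (use height - 1 and width - 1 if the box convention is inclusive).

```python
import numpy as np


def clip_boxes(bboxes, height, width):
    """Clip (y1, x1, y2, x2) boxes to the image area and cast them to int32.

    Hypothetical helper for illustration only.
    """
    bboxes = np.asarray(bboxes, dtype=np.float64).reshape(-1, 4)
    # y coordinates are bounded by the height, x coordinates by the width.
    upper = np.array([height, width, height, width])
    return np.clip(bboxes, 0, upper).astype(np.int32)


# The first box sticks out of a 128 x 128 image on two sides.
boxes = clip_boxes([[-5, 10, 40, 140], [20, 30, 60, 70]], height=128, width=128)
print(boxes)
```

Boxes that already lie inside the image pass through unchanged, so the helper can be applied to every box.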

Btw, I have not made these modifications in this repo. Instead, I took the original MaskRCNN implementation, removed the mask layer, and inserted the score maps and vote layers.

Unfortunately the results are still not the best.

[image]

gaborvecsei, Sep 14 '18 06:09

Hi, @gaborvecsei

Sorry for the late reply, again.

I will review the models to fix bugs and improve performance in the next few days.

parap1uie-s, Feb 13 '19 13:02