Keras-RFCN
Training on a simple (shapes) dataset
Hi,
I tried to train on the included shapes dataset. (The dataset code is from the Mask R-CNN repo; I only added the `load_bbox` function.)
The problem: during training the losses decrease nicely on both the training and validation sets, but at inference time the output is disappointing. On a dataset this simple, the model should overfit quickly.
```python
import math
import random

import cv2
import numpy as np

# `Dataset` and `Utils` are the base class and helpers from the
# Matterport Mask R-CNN codebase this repository builds on.


class ShapesDataset(Dataset):

    def load_shapes(self, count, height, width):
        self.add_class("shapes", 1, "square")
        self.add_class("shapes", 2, "circle")
        self.add_class("shapes", 3, "triangle")
        for i in range(count):
            bg_color, shapes = self.random_image(height, width)
            self.add_image("shapes", image_id=i, path=None,
                           width=width, height=height,
                           bg_color=bg_color, shapes=shapes)

    def load_image(self, image_id):
        info = self.image_info[image_id]
        bg_color = np.array(info['bg_color']).reshape([1, 1, 3])
        image = np.ones([info['height'], info['width'], 3], dtype=np.uint8)
        image = image * bg_color.astype(np.uint8)
        for shape, color, dims in info['shapes']:
            image = self.draw_shape(image, shape, dims, color)
        return image

    def image_reference(self, image_id):
        info = self.image_info[image_id]
        if info["source"] == "shapes":
            return info["shapes"]
        else:
            super(self.__class__).image_reference(self, image_id)

    def get_keys(self, d, value):
        return [k for k, v in d.items() if v == value]

    def load_bbox(self, image_id):
        info = self.image_info[image_id]
        shapes = info['shapes']
        count = len(shapes)
        mask = np.zeros([info['height'], info['width'], count], dtype=np.uint8)
        for i, (shape, _, dims) in enumerate(info['shapes']):
            mask[:, :, i:i + 1] = self.draw_shape(mask[:, :, i:i + 1].copy(),
                                                  shape, dims, 1)
        bboxes = []
        for i in range(mask.shape[2]):
            # Note: this 3-value return signature is OpenCV 3.x
            _, cnts, _ = cv2.findContours(mask[:, :, i] * 255,
                                          cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
            cnt = sorted(cnts, key=cv2.contourArea, reverse=True)[0]
            x, y, w, h = cv2.boundingRect(cnt)
            # The right format is: (y1, x1, y2, x2)
            bboxes.append([y, x, y + h, x + w])
        class_ids = np.array([self.class_names.index(s[0]) for s in shapes])
        if len(class_ids) != len(bboxes):
            raise ValueError("Class ids are not equal with num of bboxes")
        return np.array(bboxes), np.array(class_ids)

    def draw_shape(self, image, shape, dims, color):
        x, y, s = dims
        if shape == 'square':
            image = cv2.rectangle(image, (x - s, y - s),
                                  (x + s, y + s), color, -1)
        elif shape == "circle":
            image = cv2.circle(image, (x, y), s, color, -1)
        elif shape == "triangle":
            points = np.array([[(x, y - s),
                                (x - s / math.sin(math.radians(60)), y + s),
                                (x + s / math.sin(math.radians(60)), y + s),
                                ]], dtype=np.int32)
            image = cv2.fillPoly(image, points, color)
        return image

    def random_shape(self, height, width):
        shape = random.choice(["square", "circle", "triangle"])
        color = tuple([random.randint(0, 255) for _ in range(3)])
        buffer = 20
        y = random.randint(buffer, height - buffer - 1)
        x = random.randint(buffer, width - buffer - 1)
        s = random.randint(buffer, height // 4)
        return shape, color, (x, y, s)

    def random_image(self, height, width):
        # bg_color = np.array([random.randint(0, 255) for _ in range(3)])
        bg_color = np.array([0, 0, 0], dtype=np.uint8)
        shapes = []
        boxes = []
        N = random.randint(1, 2)
        for _ in range(N):
            shape, color, dims = self.random_shape(height, width)
            shapes.append((shape, color, dims))
            x, y, s = dims
            boxes.append([y - s, x - s, y + s, x + s])
        keep_ixs = Utils.non_max_suppression(
            np.array(boxes), np.arange(N), 0.3)
        shapes = [s for i, s in enumerate(shapes) if i in keep_ixs]
        return bg_color, shapes
```
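As a side note, for solid shapes like these, the `(y1, x1, y2, x2)` box can also be derived directly from the binary mask without `cv2.findContours`. This is a minimal, hypothetical helper (not part of the repo) sketching that alternative:

```python
import numpy as np

def bbox_from_mask(mask):
    """Return (y1, x1, y2, x2) for the nonzero region of a binary mask.
    y2/x2 are exclusive, matching the (y, x, y + h, x + w) convention above."""
    ys, xs = np.where(mask > 0)
    return ys.min(), xs.min(), ys.max() + 1, xs.max() + 1

mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:5, 3:8] = 1  # a 3x5 filled rectangle
print(bbox_from_mask(mask))  # (2, 3, 5, 8)
```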
All of the predicted boxes had a confidence of 0.9 or above.
Do you have any idea what causes this? Could you try it out, so we would have a simple "tutorial", unlike with the fashion dataset?
Hi @gaborvecsei, it seems your implementation has no obvious errors, so I will check it out in the next few days. Right now I am tied up with other projects, so I cannot give you any useful advice at the moment. I'll leave this issue open until then. Sorry.
Hi @parap1uie-s, thank you for your response! I am looking forward to hearing your thoughts on this issue; I hope we can find a solution. By the way, thank you very much for your effort on this implementation!
I had a little time and tried to find out what went wrong:
- Where I generate the boxes, I need to clip the values between 0 and the image's maximum dimensions.
- I also need to cast the boxes with `bboxes.astype(np.int32)`.
- Loading a model should be done with `by_name=True`.
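The first two fixes can be sketched as a small helper, assuming boxes in `(y1, x1, y2, x2)` order; the function name here is hypothetical:

```python
import numpy as np

def clip_boxes(bboxes, height, width):
    """Clip (y1, x1, y2, x2) boxes to the image bounds and cast to int32.
    Hypothetical helper illustrating the clipping + dtype fixes above."""
    bboxes = np.asarray(bboxes, dtype=np.float64)
    bboxes[:, [0, 2]] = np.clip(bboxes[:, [0, 2]], 0, height)  # y coordinates
    bboxes[:, [1, 3]] = np.clip(bboxes[:, [1, 3]], 0, width)   # x coordinates
    return bboxes.astype(np.int32)

print(clip_boxes([[-5, 10, 200, 60]], height=128, width=128).tolist())
# [[0, 10, 128, 60]]
```

The third fix, `by_name=True`, is the standard Keras `model.load_weights(path, by_name=True)` option, which matches layers by name instead of by order.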
By the way, I did not make these modifications here; instead I took the original Mask R-CNN implementation, removed the mask layers, and inserted the score-map and vote layers.
Unfortunately, the results are still not the best.
Hi @gaborvecsei,
sorry for the late reply, again.
I will check the models to fix the bugs and improve performance in the next few days.