Weighted-Boxes-Fusion

WBF on Polygon instead of Bounding Box

Open innat opened this issue 5 years ago • 14 comments
I've read that this algorithm uses the confidence scores of all proposed bounding boxes to construct the averaged boxes. However, I'd like to know whether it is possible to use the algorithm with an arbitrary number of polygons, as in the scene text spotting problem. I hope you get my point. Any pointers? Thanks.

innat avatar Sep 09 '20 23:09 innat

You need the following 2 functions:

  1. Easy one: get the intersection over union (IoU) metric for two polygons.
  2. Harder: get the weighted average of several polygons. I think it's not that easy, but solvable. You can start with the non-weighted average of 2 polygons.

With these 2 functions it's easy to implement WBF over polygons. I can help with that if you provide these 2 functions )
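
A rough sketch of how the two functions could plug into a WBF-style clustering loop. This is not the library's implementation: `fuse_polygons`, `iou_fn`, `average_fn` and `iou_thr` are hypothetical names, and the fused confidence is simply the mean of the member scores, which is a simplification. The toy check uses 1-D intervals as stand-ins for polygons so that it runs without any geometry library.

```python
def fuse_polygons(polys, scores, iou_fn, average_fn, iou_thr=0.5):
    """Greedy WBF-style fusion; polys and scores are parallel lists."""
    # Visit predictions from the most to the least confident.
    order = sorted(range(len(polys)), key=lambda i: scores[i], reverse=True)
    clusters = []  # each cluster keeps its members, their scores and the fused shape
    for i in order:
        best, best_iou = None, iou_thr
        for c in clusters:
            iou = iou_fn(c["fused"], polys[i])
            if iou > best_iou:
                best, best_iou = c, iou
        if best is None:
            # No fused shape overlaps enough: start a new cluster.
            clusters.append({"members": [polys[i]], "scores": [scores[i]], "fused": polys[i]})
        else:
            best["members"].append(polys[i])
            best["scores"].append(scores[i])
            best["fused"] = average_fn(best["members"], best["scores"])
    # Assumption: confidence of a fused shape is the plain mean of its member scores.
    return [(c["fused"], sum(c["scores"]) / len(c["scores"])) for c in clusters]


# Toy stand-ins: 1-D intervals (start, end) instead of polygons.
def interval_iou(a, b):
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def interval_average(members, scores):
    total = sum(scores)
    return tuple(sum(s * m[k] for m, s in zip(members, scores)) / total for k in (0, 1))


fused = fuse_polygons([(0, 10), (1, 11), (50, 60)], [0.9, 0.6, 0.8],
                      interval_iou, interval_average)
```

Here the first two intervals overlap enough to be merged into one cluster, while the third stays on its own. For real polygons, `iou_fn` and `average_fn` would be the two functions listed above.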

ZFTurbo avatar Sep 10 '20 20:09 ZFTurbo

Actually, it was an interesting question. I found an answer: https://gis.stackexchange.com/questions/68359/creating-average-polygon/68617

But polygons are complicated, and sometimes it's hard to say what the average of 2 polygons is.

ZFTurbo avatar Sep 11 '20 08:09 ZFTurbo

Thanks for your valuable comment, and for the link above. It is indeed hard to implement a precise function for the weighted average of several polygons, but here is the easy one:

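from shapely.geometry import Polygon

def cal_poly_iou(polA_xy, polB_xy):
    # Build shapely polygons from lists of (x, y) vertices.
    pol_1 = Polygon(polA_xy)
    pol_2 = Polygon(polB_xy)

    pol_intersection = pol_1.intersection(pol_2).area
    pol_union = pol_1.union(pol_2).area

    # Guard against degenerate (zero-area) input to avoid dividing by zero.
    iou = pol_intersection / pol_union if pol_union > 0 else 0.0
    return iou, pol_1, pol_2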
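
A small worked example of polygon IoU along the same lines. The helper name `safe_poly_iou` is hypothetical, and the `make_valid` repair step is an addition that is not part of the function above; it assumes Shapely 1.8 or newer. It is there because predicted text polygons can self-intersect, and set operations on invalid geometries may fail.

```python
from shapely.geometry import Polygon
from shapely.validation import make_valid  # assumption: Shapely 1.8+


def safe_poly_iou(pts_a, pts_b):
    """IoU of two polygons given as (x, y) vertex lists; repairs invalid rings first."""
    a, b = Polygon(pts_a), Polygon(pts_b)
    if not a.is_valid:
        a = make_valid(a)
    if not b.is_valid:
        b = make_valid(b)
    union = a.union(b).area
    return a.intersection(b).area / union if union > 0 else 0.0


# Two 2x2 squares shifted by 1 along x: overlap area 2, union area 6.
square_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
square_b = [(1, 0), (3, 0), (3, 2), (1, 2)]
iou = safe_poly_iou(square_a, square_b)
```

For these two squares the IoU is 2 / 6, i.e. one third.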

innat avatar Sep 13 '20 08:09 innat

The first idea that comes to mind for the average of two polygons:

  1. Take the polygon with the higher probability as the base (call it polygon 1).
  2. For each point of this first polygon, take the nearest point of the second polygon and find the middle of the segment between them. Put a point of the resulting polygon there.
  3. If we use weights, the point is not placed in the middle of the segment but shifted towards the polygon with the larger weight.
  4. If you need the average of a large number of polygons, combine them sequentially, one by one, adding each next polygon to the "heavier" one.

This algorithm shouldn't break non-convex polygons such as letters.
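
The steps above can be sketched roughly as follows. This is only one reading of the idea, not a tested solution: the function names are hypothetical, "nearest point" is taken to mean the nearest point on the boundary of the second polygon (not just its nearest vertex), and the result inherits the vertex count of the base polygon.

```python
import math


def closest_point_on_boundary(p, poly):
    """Closest point to p on the closed boundary of poly (list of (x, y) vertices)."""
    best, best_d = None, math.inf
    for (ax, ay), (bx, by) in zip(poly, poly[1:] + poly[:1]):
        dx, dy = bx - ax, by - ay
        seg_len2 = dx * dx + dy * dy
        if seg_len2 == 0:
            t = 0.0  # repeated vertex: the segment collapses to a point
        else:
            # Projection parameter of p onto segment a-b, clamped to the segment.
            t = max(0.0, min(1.0, ((p[0] - ax) * dx + (p[1] - ay) * dy) / seg_len2))
        qx, qy = ax + t * dx, ay + t * dy
        d = math.hypot(p[0] - qx, p[1] - qy)
        if d < best_d:
            best, best_d = (qx, qy), d
    return best


def average_two(poly_a, w_a, poly_b, w_b):
    """Steps 1-3: shift each vertex of the heavier polygon towards the lighter one."""
    if w_b > w_a:  # the base polygon is the one with the larger weight
        poly_a, w_a, poly_b, w_b = poly_b, w_b, poly_a, w_a
    poly_b = list(poly_b)
    total = w_a + w_b
    out = []
    for p in poly_a:
        q = closest_point_on_boundary(p, poly_b)
        out.append(((w_a * p[0] + w_b * q[0]) / total,
                    (w_a * p[1] + w_b * q[1]) / total))
    return out


def average_many(polys, weights):
    """Step 4: fold polygons in one by one, heaviest first, accumulating the weight."""
    order = sorted(range(len(polys)), key=lambda i: weights[i], reverse=True)
    merged, total = list(polys[order[0]]), weights[order[0]]
    for i in order[1:]:
        merged = average_two(merged, total, polys[i], weights[i])
        total += weights[i]
    return merged


outer = [(0, 0), (4, 0), (4, 4), (0, 4)]
inner = [(1, 1), (3, 1), (3, 3), (1, 3)]
merged = average_many([outer, inner], [3.0, 1.0])
```

With weights 3 and 1, each corner of the outer square moves a quarter of the way towards the matching corner of the inner square. Note that sequential merging depends on the order of the polygons, so this is an approximation rather than a true mean shape.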

ZFTurbo avatar Sep 14 '20 12:09 ZFTurbo

@innat do you have some polygons you want to ensemble? I'd like to prepare a small benchmark for testing.

ZFTurbo avatar Sep 15 '20 20:09 ZFTurbo

@ZFTurbo I don't have any yet, but I can make some. In the meantime, how about the following polygons from the link you gave above?

shapes = [
    {'type': 'Polygon', 'coordinates': [[(1095.76, 278.06), (1095.76, 278.06), (1228.25, 301.98), (1377.29, 301.98), (1511.62, 283.58), (1603.62, 254.14), (1669.86, 224.7), (1737.95, 175.02), (1772.91, 129.01), (1791.31, 77.49), (1804.19, -1.63), (1796.83, -53.15), (1776.59, -121.24), (1726.91, -198.52), (1629.38, -303.4), (1491.38, -413.81), (1215.37, -575.73), (764.55, -809.42), (617.34, -883.03), (508.78, -929.03), (431.5, -951.11), (210.69, -965.83), (135.24, -938.23), (111.32, -888.55), (96.6, -783.66), (126.04, -619.9), (194.13, -469.01), (295.33, -296.04), (381.81, -150.68), (501.42, -20.03), (630.22, 83.01), (771.91, 167.66), (924.63, 232.06), (1027.68, 261.5), (1095.76, 278.06)]]},
    {'type': 'Polygon', 'coordinates': [[(1865.28, 145.78), (1865.28, 145.78), (1779.15, 286.31), (1629.55, 381.5), (1425.57, 438.17), (1226.11, 435.9), (1037.99, 404.17), (829.46, 306.71), (657.21, 170.72), (548.41, 32.46), (466.82, -87.67), (328.56, -407.25), (287.76, -559.11), (287.76, -731.37), (321.76, -869.63), (385.22, -944.42), (480.42, -967.09), (729.74, -971.62), (913.33, -917.23), (1144.51, -806.17), (1432.37, -647.51), (1659.02, -482.05), (1819.94, -302.99), (1908.34, -117.14), (1901.54, 14.32), (1865.28, 145.78)]]},
    {'type': 'Polygon', 'coordinates': [[(1175.76, 247.32), (1175.76, 247.32), (1336.5, 258.21), (1450.92, 251.4), (1550.36, 229.61), (1645.71, 195.55), (1724.72, 150.6), (1758.78, 111.1), (1777.85, -19.67), (1765.59, -71.44), (1709.74, -157.25), (1603.49, -258.06), (1463.18, -362.95), (1181.21, -504.61), (524.63, -841.08), (305.32, -965.04), (211.33, -1007.26), (-21.61, -1049.49), (-82.91, -1034.51), (-111.51, -975.93), (-111.51, -857.42), (-86.99, -745.72), (50.59, -505.98), (143.22, -332.98), (290.33, -165.43), (470.14, -30.57), (659.49, 78.41), (881.52, 175.12), (1044.99, 224.16), (1175.76, 247.32)]]},
    {'type': 'Polygon', 'coordinates': [[(886.58, 201.11), (886.58, 201.11), (1106.77, 271.57), (1249.89, 286.98), (1430.44, 286.98), (1531.73, 267.16), (1694.67, 205.51), (1760.72, 152.67), (1789.35, 106.43), (1798.15, 33.77), (1767.33, -107.15), (1613.2, -292.11), (1386.41, -450.64), (1150.81, -569.54), (710.44, -802.94), (441.81, -961.47), (325.11, -1020.92), (223.83, -1045.14), (49.88, -1067.16), (-16.18, -1047.35), (-27.19, -992.3), (-38.2, -913.03), (-16.18, -805.14), (32.26, -655.42), (175.38, -408.81), (340.52, -148.99), (494.65, -10.27), (688.42, 117.44), (813.92, 176.89), (886.58, 201.11)]]},
    {'type': 'Polygon', 'coordinates': [[(802.94, 60.03), (802.94, 60.03), (1012.93, 172.53), (1195.43, 230.02), (1370.42, 257.52), (1510.41, 250.02), (1610.41, 227.52), (1697.91, 195.02), (1755.41, 147.53), (1785.4, 102.53), (1795.4, 32.53), (1800.4, -57.47), (1790.4, -119.96), (1720.41, -227.46), (1585.41, -354.95), (1312.92, -552.45), (1055.43, -707.44), (730.45, -899.93), (540.45, -1009.93), (400.46, -1034.93), (275.46, -1044.93), (225.47, -1024.93), (197.97, -939.93), (200.47, -817.43), (272.96, -632.44), (367.96, -424.95), (472.96, -244.96), (612.95, -84.96), (752.94, 22.53), (802.94, 60.03)]]},
]

innat avatar Sep 15 '20 20:09 innat

That's too easy a case, with convex polygons. We need some letters, I guess.

ZFTurbo avatar Sep 15 '20 20:09 ZFTurbo

Great. Now, it's midnight here. I will prepare something by tomorrow and share it with you. -)

innat avatar Sep 15 '20 21:09 innat

@ZFTurbo I've picked 10 samples from the Total-Text data set, ran inference on them using the MaskTextSpotterV3 model, and saved 3 predicted outcomes for each (30 in total). This may not be much, but I think it's OK for now. Here is the link. Please have a look.

It contains the following folders:

  • Img (10 samples)
  • gt (10 text files, ground truth)
  • pred

pred contains another 10 folders (named after the images). Each folder contains 3 predictions. I've also attached a notebook (total_text_10_EDA.ipynb) that contains basic EDA on these files.

innat avatar Sep 16 '20 10:09 innat

Thank you. I downloaded the data and will look into it.

ZFTurbo avatar Sep 25 '20 09:09 ZFTurbo

Hello @ZFTurbo, may I ask about the state of this issue? I want to use WBF on polygons in my project, so I would like to help finish the implementation.

DavidAdamczyk avatar Oct 13 '21 08:10 DavidAdamczyk

@DavidAdamczyk there were 2 guys who tried to solve that problem. Unfortunately, they didn't make it.

If you speak Russian, I wrote a formalization of the task (with some ideas and links): https://docs.google.com/document/d/1pFVgtxq4_HW3i62if660EPM2xqn7VwekqT1nqGn31j0/edit

ZFTurbo avatar Oct 13 '21 09:10 ZFTurbo

Thank you! That sounds great! I am starting work on this task already.

DavidAdamczyk avatar Oct 13 '21 10:10 DavidAdamczyk

It would be nice if you shared your progress along the way. I can help with some parts. If you have an account at ODS.ai (https://ods.ai/join-community), you can find me there: zfturbo.

ZFTurbo avatar Oct 13 '21 10:10 ZFTurbo