Hello, I would like to ask about the following error, which occurs when training the DBNet model on the ICDAR2017 dataset.

Open ViviKing414 opened this issue 2 years ago • 5 comments

raise TopologicalError(
shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. Likely cause is invalidity of the geometry <shapely.geometry.polygon.Polygon object at 0x7f63038f4520>

ViviKing414 avatar Sep 13 '22 09:09 ViviKing414

Hi, this error is probably caused by invalid polygon annotations in the dataset, for example polygons that intersect themselves.

If you are using MMOCR 1.x, you can add our FixInvalidPolygon transform directly to the training/testing pipeline; it will automatically fix the invalid shapes.

https://github.com/open-mmlab/mmocr/blob/87f15b3135104db5cd104001a129ca2afe185094/mmocr/datasets/transforms/textdet_transforms.py#L119
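
As a rough illustration, the transform is added to the pipeline like any other entry in a config file. The fragment below is only a sketch: the surrounding transforms and their arguments are assumptions chosen to make it readable, and a real DBNet pipeline contains further augmentation steps. Only the `FixInvalidPolygon` entry is the point here.

```python
# Sketch of an MMOCR 1.x text-detection training pipeline (config fragment).
# The neighbouring transforms are illustrative; keep the ones from your own config.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='LoadOCRAnnotations',
        with_bbox=True,
        with_polygon=True,
        with_label=True),
    # Handle invalid polygons before later steps hand them to shapely.
    dict(type='FixInvalidPolygon'),
    dict(
        type='PackTextDetInputs',
        meta_keys=('img_path', 'ori_shape', 'img_shape')),
]
```

Where exactly the entry belongs depends on which of your transforms operate on polygons, so treat the position shown as a starting point rather than a rule.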

If you are using MMOCR 0.x, we recommend writing a script that filters out or fixes the invalid samples. Take the following code as an example.

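from shapely.geometry import Polygon

box = [0, 0, 3, 3]  # [x, y, w, h]
polygon = Polygon([(0,0), (0,3), (3,3), (3,0), (2,0), (2,2), (1,2), (1,1), (2,1), (2,0), (0,0)])
# Check validity once, because `polygon` may be replaced by a plain list below.
is_valid = polygon.is_valid  # False here: the boundary retraces one of its own edges

# Option 1: replace an invalid polygon with its bounding box,
# written as flat coordinates [x, y, x+w, y, x+w, y+h, x, y+h]
if not is_valid:
    x, y, w, h = box
    polygon = [x, y, x+w, y, x+w, y+h, x, y+h]

# Option 2: alternatively, mark this sample as ignored
if not is_valid:
    ignored = 1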
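
To apply the same check across a whole annotation file, a small helper can list which polygons need attention. This is a minimal sketch: the flat `[x1, y1, x2, y2, ...]` layout and the name `find_invalid_polygons` are assumptions made for illustration, not part of MMOCR.

```python
from shapely.geometry import Polygon


def find_invalid_polygons(flat_polygons):
    """Return the indices of polygons that shapely considers invalid.

    Each polygon is a flat list [x1, y1, x2, y2, ...]; this layout is a
    hypothetical annotation format chosen for the sketch.
    """
    invalid = []
    for idx, coords in enumerate(flat_polygons):
        # Pair up the flat list into (x, y) points.
        points = list(zip(coords[0::2], coords[1::2]))
        # Fewer than three points cannot form a polygon at all.
        if len(points) < 3 or not Polygon(points).is_valid:
            invalid.append(idx)
    return invalid


annotations = [
    [0, 0, 3, 0, 3, 3, 0, 3],  # plain square: valid
    [0, 0, 2, 2, 2, 0, 0, 2],  # "bow-tie": two edges cross, so invalid
]
print(find_invalid_polygons(annotations))
```

The returned indices can then be used to replace those polygons with their bounding boxes or to mark the samples as ignored, as shown above.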

xinke-wang avatar Sep 13 '22 10:09 xinke-wang

Thank you very much for your answer. I'll try it out right away.

ViviKing414 avatar Sep 14 '22 02:09 ViviKing414

Regarding the MMOCR version, is there any difference between the 1.x version and the default MMOCR version?

ViviKing414 avatar Sep 14 '22 04:09 ViviKing414

Excuse me again. When training with version 1.x, I get this error: ImportError: cannot import name 'register_all_modules' from 'mmocr.utils'.

ViviKing414 avatar Sep 14 '22 04:09 ViviKing414

Hi,

1.x is the latest major release of MMOCR and introduces many new features (see our latest documentation, available in English and Chinese, for details). The 0.x version (the default version) will be deprecated at the end of this year, so we recommend that users migrate to 1.x.

Since 1.x relies on the new MMEngine, it cannot share a runtime environment with 0.x. You have to install it by following our installation guide.
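
The ImportError above is consistent with a 0.x package still being the one that Python imports. A quick way to see what the current environment contains is to read the installed package metadata. The sketch below uses only the standard library; the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '1.0.0rc0'."""
    return int(version_string.split('.')[0])


def installed_major(package):
    """Return the installed major version of `package`, or None if absent."""
    try:
        return major_version(version(package))
    except PackageNotFoundError:
        return None


# A 1.x setup should report mmocr as 1 and have mmengine installed.
for name in ('mmocr', 'mmengine'):
    print(name, installed_major(name))
```

If `mmocr` reports 0, or `mmengine` reports None, the environment is not a 1.x setup and should be rebuilt according to the installation guide.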

xinke-wang avatar Sep 14 '22 05:09 xinke-wang