DECIMER-Image-Segmentation

Segmentation of the compound ID as well

Open HiteSit opened this issue 5 months ago • 3 comments

Dear Development team,

Looking through your code, I noticed that there is no option to segment not only the structures but also the compound ID that often accompanies them in patents (usually in more or less the same style; an example is attached). I imagine a protocol that also segments the ID, after which a simple OCR tool (pytesseract) or a more complex one (perhaps something based on deep learning) could recognise the ID number and associate it with the structure. I am aware that the ID does not sit in a constant position across patents: for example, sometimes it is 12ptx and other times 6ptx from the recognised structure, and sometimes it is horizontally centred while other times it is not. Still, I can imagine some sort of sample script in which the user adjusts a few parameters until they are satisfied with the segmentation.
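To make the idea concrete, here is a minimal sketch of the association step only. It assumes the structure boxes and the candidate ID boxes have already been detected somehow and are given in pixel coordinates. The names `Box`, `find_label_below`, `max_gap` and `max_offset` are hypothetical and not part of DECIMER; the two parameters stand in for the user-tunable distance and centring settings mentioned above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels; y grows downward (hypothetical helper)."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def center_x(self) -> float:
        return (self.x0 + self.x1) / 2


def find_label_below(
    structure: Box,
    candidates: Sequence[Box],
    max_gap: int = 40,
    max_offset: Optional[float] = None,
) -> Optional[int]:
    """Return the index of the nearest candidate box below `structure`, or None.

    max_gap:    largest allowed vertical distance between the bottom of the
                structure and the top of the label (user-tunable assumption).
    max_offset: if given, the label centre must lie within this horizontal
                distance of the structure centre; if None, the label only
                has to overlap the structure's horizontal extent.
    """
    best_index: Optional[int] = None
    best_gap: Optional[int] = None
    for index, cand in enumerate(candidates):
        gap = cand.y0 - structure.y1
        if gap < 0 or gap > max_gap:
            continue  # label is above the structure or too far below it
        if max_offset is None:
            if cand.x1 < structure.x0 or cand.x0 > structure.x1:
                continue  # no horizontal overlap with the structure
        elif abs(cand.center_x - structure.center_x) > max_offset:
            continue  # label is not centred closely enough
        if best_gap is None or gap < best_gap:
            best_index, best_gap = index, gap
    return best_index


# Two structures side by side, two nearby labels, one unrelated box far below.
structures = [Box(100, 100, 300, 250), Box(400, 100, 600, 250)]
labels = [Box(180, 262, 220, 280), Box(420, 256, 450, 274), Box(100, 500, 140, 520)]

matches = [find_label_below(s, labels) for s in structures]
# Each matched label box could then be cropped and passed to an OCR step,
# e.g. pytesseract.image_to_string on the cropped region.
```

With the default settings both structures get a label, even though the second label is not centred. Passing `max_offset=20` would require centring and leave the second structure unmatched, which is the kind of trade-off a user would tune per patent.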

Before I start to see whether I can do it myself: is there a specific reason why such a feature was not implemented, and what might the challenges be?

Thanks very much, and terrific work!

image

HiteSit, Sep 26 '24 11:09