unshred

Write a feature detector to find fragments of lines on shreds.

Open dchaplinsky opened this issue 11 years ago • 2 comments

We need a feature detector that accepts a shred and tries to find fragments of lines on it. The proposed algorithm is:

  • Remove pixels along the border of the shred to get rid of false positives.
  • Apply adaptive binarisation to get rid of colour information.
  • Detect lines using the Hough transform or a similar method.
  • Ignore lines which are too short or lying too close together. That probably requires some adaptive algorithm and should take DPI information into account. Another promising idea is filtering by a histogram of angles. Basically, we are looking for lines in order to find fragments of a table, so we would expect the detected lines to fall into two buckets: those with an angle of X (+/- Y degrees) and those with an angle of X+90 (i.e. perpendicular). The rest can probably be discarded.
  • Return the list of lines (including angles!)
  • Try to suggest some automatic tags, such as: has lines (the easy one), has parallel lines, has perpendicular lines.
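
The filtering and tagging steps above could be sketched roughly as follows. This is only an illustration, not the existing partial solution: it assumes the segments have already been detected (for example by a Hough-style detector) and are given as `(x1, y1, x2, y2)` tuples in pixels. The function names, the `min_length`, `tolerance` and `bin_width` defaults, and the exact tag strings are all hypothetical, and a real version would derive `min_length` from the DPI.

```python
import math
from collections import Counter


def segment_angle(seg):
    """Undirected angle of a segment in degrees, in [0, 180)."""
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def segment_length(seg):
    x1, y1, x2, y2 = seg
    return math.hypot(x2 - x1, y2 - y1)


def angular_distance(a, b, period):
    """Smallest difference between two angles on a circle of given period."""
    d = abs(a - b) % period
    return min(d, period - d)


def filter_by_angle(segments, min_length=20.0, tolerance=5.0, bin_width=2.0):
    """Drop short segments, then keep those near X or X+90 degrees.

    All defaults are assumed values for illustration only.
    Returns (kept_segments, dominant_angle), dominant_angle in [0, 90).
    """
    long_enough = [s for s in segments if segment_length(s) >= min_length]
    if not long_enough:
        return [], None
    # Fold angles modulo 90 so that X and X+90 land in the same bin,
    # and weight each bin by total segment length.
    hist = Counter()
    for s in long_enough:
        folded = segment_angle(s) % 90.0
        hist[int(folded // bin_width)] += segment_length(s)
    best_bin = max(hist, key=hist.get)
    dominant = (best_bin + 0.5) * bin_width
    kept = [
        s for s in long_enough
        if angular_distance(segment_angle(s) % 90.0, dominant, 90.0) <= tolerance
    ]
    return kept, dominant


def suggest_tags(kept, dominant, tolerance=5.0):
    """Suggest auto tags from the filtered segments (tag names are assumed)."""
    if not kept:
        return []
    tags = ["has lines"]
    first = [s for s in kept
             if angular_distance(segment_angle(s), dominant, 180.0) <= tolerance]
    second = [s for s in kept
              if angular_distance(segment_angle(s), dominant + 90.0, 180.0) <= tolerance]
    if len(first) >= 2 or len(second) >= 2:
        tags.append("has parallel lines")
    if first and second:
        tags.append("has perpendicular lines")
    return tags
```

Two caveats for this sketch: the histogram does not merge bins across the 0/90 wrap-around, and two collinear pieces of the same line would still count as "parallel", so segments lying too close together would need to be merged first.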

Thanks to @mr-const and @xa4a we have a partial solution that needs some refinement.

dchaplinsky Sep 30 '14 23:09

For the existing solution, we are looking for:

  • Improved accuracy
  • Heuristics to suggest some tags
  • Ideally: some way to evaluate the algorithm against a ground-truth dataset.
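
One possible shape for such an evaluation, as a rough sketch only: match detected segments to hand-labelled ones and report precision and recall. The matching rule (similar angle, nearby midpoints), the thresholds, the greedy one-to-one matching and the function names are all assumptions made for illustration; no ground-truth format has been defined yet.

```python
import math


def _angle(seg):
    """Undirected angle of a segment in degrees, in [0, 180)."""
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def _midpoint(seg):
    x1, y1, x2, y2 = seg
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def segments_match(a, b, max_angle_diff=5.0, max_midpoint_dist=10.0):
    """Assumed matching rule: similar angle and nearby midpoints."""
    diff = abs(_angle(a) - _angle(b)) % 180.0
    diff = min(diff, 180.0 - diff)
    (ax, ay), (bx, by) = _midpoint(a), _midpoint(b)
    return diff <= max_angle_diff and math.hypot(ax - bx, ay - by) <= max_midpoint_dist


def evaluate(detected, truth):
    """Greedy one-to-one matching; returns (precision, recall)."""
    unmatched = list(truth)
    true_positives = 0
    for d in detected:
        for t in unmatched:
            if segments_match(d, t):
                unmatched.remove(t)  # each labelled line is matched at most once
                true_positives += 1
                break
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

Averaging these two numbers over a set of labelled shreds would give a simple way to compare variants of the detector.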

The idea of building a histogram of angles/lengths to filter out false positives seems promising to me. Also, check the PR comments I made the other day.

dchaplinsky Sep 30 '14 23:09

This is under development in #8.

dchaplinsky Oct 04 '14 23:10