connected-components-3d icon indicating copy to clipboard operation
connected-components-3d copied to clipboard

Better Algorithm Mk. II

Open william-silversmith opened this issue 5 years ago • 7 comments

Discussed this some in #6

There are several faster algorithms than Wu et al's 2005 decision tree.

  • He et al 2007 describes a way of using a simpler data structure than union-find, but it requires 3x as much memory (3 arrays) for what is probably a 10-20% improvement. They don't compare with Wu et al in their paper.
  • Chang's contour tracing algorithm is very fast and probably shouldn't be ignored. It might be the best for our number of labels (see figure below from Grana).
  • Grana's 2009 paper on 2x2 block based evaluation is promising. It's complex in 2D, and extending it to 3D requires a very complicated tree, though if the chip can handle it efficiently, there's a lot of efficiencies to be gained, maybe even more than in 2D.
image Figure from Grana, Borghesani, Cucchiara. "FAST BLOCK BASED CONNECTED COMPONENTS LABELING". IEEE 2009. doi: 10.1109/ICIP.2009.5413731

william-silversmith avatar May 06 '20 17:05 william-silversmith

It's also worth noting that Wu's algorithm is good enough that we are in a regime where the best known algorithm probably won't gain us more than 2x performance, while parallel could gain us several times more than that.

william-silversmith avatar May 08 '20 23:05 william-silversmith

The block based approach is undermined by multi-label. It depends on being able to summarize entire regions using the knowledge that all foreground pixels will match.

william-silversmith avatar Aug 06 '20 19:08 william-silversmith

It might still be fun to implement block-based for binary images, which are an important class of operation.

william-silversmith avatar Aug 24 '20 05:08 william-silversmith

OpenCV implements Grana et al's Optimized Block Based Decision Tree for binary images (apparently by that team themselves) and I tested it out on a medical YACCLAB dataset. Their algorithm was almost twice as fast, getting 470-493 MVx/sec vs about 250 MVx/sec from mine. Pretty impressive. I had assumed that the ceiling for a good algorithm was more like 20% rather than nearly 100%.

william-silversmith avatar Sep 24 '20 00:09 william-silversmith

Was able to provide a better algorithm for 6-connected which makes me happy. I think the real secret of BBDT is the ability to make 1/4 fewer writes in the first pass. Will have to try that out.

william-silversmith avatar Oct 12 '20 21:10 william-silversmith

BBDT and approaches derived from it are not useful for multilabel, however I suspect that the pixel prediction approach is readily compatible. Create a forest of trees for each situation and use goto to select the correct tree for the next voxel. This can allow the elimination of a lot of duplicate work. 4 and 6 connected are prime candidates for this. 6-connected especially is about 2-3x worse than a null pass, so there's lots of room for improvement.

This technique is described in some detail in this paper: https://prittt.github.io/pub_files/2016acivs.pdf

For multi-label, we would use knowledge of the previous differently labeled pixel to remove pieces from the full decision tree rather than positively select information that is already known.

william-silversmith avatar Jan 16 '21 06:01 william-silversmith

It would be cool if I integrated some of the crazy binary algorithms from YACCLAB. That group is on a tear.

william-silversmith avatar Aug 19 '22 04:08 william-silversmith