opencv_contrib
opencv_contrib copied to clipboard
Add fast grayscale morphology using sparse table data structure
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Abstract
This is a pull request for a program that calculates grayscale morphological operations quickly by using a data structure called a sparse table.
About sparse table
1D sparse table
For the data structure, see this: https://www.geeksforgeeks.org/sparse-table/
It is possible to quickly find the minimum (maximum) value of an array within a range whose length is a power of two.
Let's call the nth row of the sparse table the "row of depth n". The nth row contains the minimum (maximum) values in consecutive subsequences of length $2^n$ of the original array.
| sparse tabl(min) | |||||||||
|---|---|---|---|---|---|---|---|---|---|
| depth0(original array) | 1 | 4 | 9 | 2 | 5 | 0 | 5 | 7 | 3 |
| depth1 | 1 | 4 | 2 | 2 | 0 | 0 | 5 | 3 | - |
| depth2 | 1 | 2 | 0 | 0 | 0 | 0 | - | - | - |
| depth3 | 0 | 0 | - | - | - | - | - | - | - |
Let us denote by $F$ the operation of computing the next row from a row in a sparse table.
$st[0] = original~array$ $st[depth + 1] = F(st[depth])$
2D sparse table
For the data structure, see this: https://www.geeksforgeeks.org/2d-range-minimum-query-in-o1/
Change the order of the subscripts to:
$st[dr][dc][r][c]$
where $dr$ is the depth in the row direction, $dc$ is the depth in the column direction, $r$ is the index in the row direction, and $c$ is the index in the column direction. $st[dr][dc]$ is a two-dimensional table that lists the minimum (maximum) values within a rectangle R of height $2^{dr}$ and width $2^{dc}$ that is slid across the original array A. In other words, it is the result of a morphological operation when A is a grayscale image and R is the structuring element.
In terms of the depth (dr, dc) of a sparse table, the operation of increasing the depth by +1 in the row and col directions will be denoted as Fr and Fc, respectively.
$st[0][0] = original~image$ $st[dr+1][dc] = F_r(st[dr][dc])$ $st[dr][dc+1] = F_c(st[dr][dc])$
Sparse table morphology
Morphological operations are performed in the following steps:
- Prepare a set of rectangles whose width and height are powers of 2 to cover the structuring element
- Group the rectangles by height and width.
- Prepare a MAT DST to accumulate the results
- For each group, do the following
- Create st[dr][dc] that corresponds to the size of this group
- For each rectangle in the group, do the following
- Shift st[dr][dc] by the position of each rectangle, and calculate min/max against DST.
At step 1, I implemented a method to find corners in a sparse table of structuring elements. I don't know if it's the best method, but it's giving good results.
At step 4-1, st[dr][dc] is created by reusing st[dr'][dc'] of smaller size. There are several possible reuse strategies, but I implemented a method that minimizes the total number of calculations of Fr and Fc. This is reduced to the Rectilinear Steiner Arborescence Problem. (https://link.springer.com/article/10.1007/BF01758762).
Examples
With a structuring element of MORPH_ELLIPSE, size(5, 5)
The structuring element is decomposed into six rectangles. The result of grouping in step 2 is as follows:
- Size(4, 2): Point(0, 1), Point(1, 1), Point(0, 2), Point(1, 2)
- Size(1, 4): Point(2, 0), Point(2, 1)
For size(4, 2), we need to calculate st[1][2], and for size(1, 4), we need to calculate st[2][0].
- st[0][0] =: original image
- st[1][0] =: Fr(st[0][0])
- st[1][1] =: Fc(st[1][0])
- st[1][2] =: Fc(st[1][1])
- shift & min/max process for Point(0, 1), Point(1, 1), Point(0, 2), Point(1, 2)
- st[2][0] =: Fr(st[1][0])
- shift & min/max process for Point(2, 0), Point(2, 1)
Performance measurement
- Execution environment
- Intel i7-8000 CPU @ 3.20GHz
- 32GB RAM
- image
- lena.png(512x512), CV_8UC3
- conditions
- cv_RECT: MORPH_RECT, cv_erode
- cv_CROSS: MORPH_CROSS, cv_erode
- cv_ELLIPSE: MORPH_ELLIPSE, cv_erode
- st_RECT: MORPH_RECT, sparse_table_erode
- st_CROSS: MORPH_CROSS, sparse_table_erode
- st_ELLIPSE: MORPH_ELLIPSE, sparse_table_erode
When kernel is decomposited in advance:
From the graph, time complexity against diameter of kernel are as follows:
- cv_RECT and cv_CROSS: O(k)
- cv_ELLIPSE: O(k^2)
- st_RECT and st_CROSS: O(k^0.5)
- st_ELLIPSE: O(k^0.76)
Including decomposition time:
When the kernel size approaches the same size as the target image, the kernel decomposition time becomes comparable to that of CV_RECT.