feat: add mask parameter to graycomatrix function for selective computation

Open faisaljayousi opened this issue 1 year ago • 11 comments

mask parameter for graycomatrix function

  • New Feature: Added a mask parameter to the graycomatrix function, enabling selective computation of the gray-level co-occurrence matrix (GLCM).

  • Functionality: The mask allows users to define regions of interest within an image using a binary array, where computations are performed only on the unmasked areas.

  • Documentation & Tests: Updated the function's documentation and modified test_texture.py to reflect the new mask parameter.
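To make the masking idea concrete, here is a minimal NumPy sketch of a masked co-occurrence count for a single offset. It is not the implementation in this PR: the function name `masked_glcm` and the `ignore` argument are hypothetical. The sketch assumes that `True` marks a pixel to skip and that a pair is counted only when both of its pixels are kept.

```python
import numpy as np


def masked_glcm(image, d_row, d_col, levels, ignore=None):
    """Count co-occurrences for a single (d_row, d_col) offset.

    `ignore` is a hypothetical boolean array; True marks pixels to skip.
    A pair is counted only if both of its pixels are kept.
    """
    image = np.asarray(image)
    rows, cols = image.shape
    # Slices selecting every reference pixel whose neighbour is in bounds.
    r0, r1 = max(0, -d_row), min(rows, rows - d_row)
    c0, c1 = max(0, -d_col), min(cols, cols - d_col)
    ref = image[r0:r1, c0:c1]
    nbr = image[r0 + d_row:r1 + d_row, c0 + d_col:c1 + d_col]

    keep = np.ones(ref.shape, dtype=bool)
    if ignore is not None:
        ignore = np.asarray(ignore, dtype=bool)
        keep &= ~ignore[r0:r1, c0:c1]
        keep &= ~ignore[r0 + d_row:r1 + d_row, c0 + d_col:c1 + d_col]

    glcm = np.zeros((levels, levels), dtype=np.uint32)
    # np.add.at accumulates correctly when index pairs repeat.
    np.add.at(glcm, (ref[keep], nbr[keep]), 1)
    return glcm


image = np.array([[0, 0, 1], [1, 2, 2]], dtype=np.uint8)
ignore = np.zeros(image.shape, dtype=bool)
ignore[0, 2] = True  # skip the top-right pixel

print(masked_glcm(image, 0, 1, levels=3))
print(masked_glcm(image, 0, 1, levels=3, ignore=ignore))
```

In the second call, the pair that ends on the skipped pixel is dropped, so the matrix holds one count fewer than in the first.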

faisaljayousi avatar Sep 16 '24 13:09 faisaljayousi

Also, have you looked at this from a performance perspective? Is the new overhead due to a mask (even if not given) significant?

lagru avatar Sep 17 '24 13:09 lagru

Thanks! I gave this a first pass. Seems to me that supporting a mask would make sense.

How did you come up with the idea to add this feature? Since I'm not intimately familiar with the algorithm, I'd love to have more context.

I recently used this function on biological images for feature extraction. As is often the case with this type of image, I had to mask out the background (low-intensity pixels) to get meaningful features. Masking helps shift the focus to the relevant regions (non-black pixels in my case here).

Also, have you looked at this from a performance perspective? Is the new overhead due to a mask (even if not given) significant?

I have. Here is the script I used:

from skimage.feature import graycomatrix
import numpy as np
import skimage
import timeit


def test_graycomatrix_mask():
    result = graycomatrix(
        image,
        [1, 3, 5],
        [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
        levels=levels,
        mask=mask,
    )
    return result


def test_graycomatrix():
    result = graycomatrix(
        image,
        [1, 3, 5],
        [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
        levels=levels,
    )
    return result


if __name__ == "__main__":
    print(f"Scikit-image version: {skimage.__version__}")

    number = 100  # number of runs
    shape = (512, 512)
    levels = 3

    image = np.random.randint(0, levels, size=shape, dtype=np.uint8)
    mask = None

    execution_time = timeit.timeit(
        "test_graycomatrix_mask()", globals=globals(), number=number
    )

    # execution_time = timeit.timeit(
    #     "test_graycomatrix()", globals=globals(), number=number
    # )

    print(
        f"Average execution time over {number} runs: {execution_time/number:.6f} seconds"
    )

I got the following runtimes:

Scikit-image version: 0.24.1rc0.dev0
Average execution time over 100 runs: 0.004796 seconds

Scikit-image version: 0.24.0
Average execution time over 100 runs: 0.002971 seconds

That's a runtime increase of roughly 60%, even with mask=None. It might be worth creating a separate function to compute the GLCM when a mask is provided and calling it from the original scikit-image function. I'm open to any suggestions.
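For reference, the relative overhead can be worked out from the two averages reported above. The variable names are only illustrative; the figure comes to about 61%.

```python
dev_build = 0.004796  # seconds per call, 0.24.1rc0.dev0 (mask=None)
baseline = 0.002971   # seconds per call, 0.24.0

# Relative slowdown of the development build against the release.
overhead = (dev_build - baseline) / baseline
print(f"Relative overhead: {overhead:.1%}")
```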

Thanks for taking the time to review this!

faisaljayousi avatar Sep 17 '24 18:09 faisaljayousi

Hey, just to keep you in the loop: I'll be traveling next week, but I intend to review this and get it in after that. Please feel welcome to ping me if this falls through the cracks and nobody else has taken a look. :wink:

lagru avatar Oct 11 '24 16:10 lagru

Sorry for taking a while. The performance looks good now; there seems to be no penalty when not using a mask.

I think this looks close with only a few places to polish left. 😊

Hey. No worries, thanks for your input on this :). I added a comment that I think is descriptive enough. Let me know what you think.

faisaljayousi avatar Oct 28 '24 09:10 faisaljayousi

Thanks for the updates @faisaljayousi. Unfortunately, I found a significant problem with our optimization in https://github.com/scikit-image/scikit-image/pull/7544#discussion_r1821313407. We'll probably need to revisit the approach. Sorry for leading you on a wild goose chase here! :pray: Perhaps we can think of another solution or a way around this...?

lagru avatar Oct 29 '24 18:10 lagru

I fear that the best solution right now is to duplicate code from _glcm_loop that is only used when mask is not None. That way the unmasked case isn't slowed down by unnecessary checks.
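A rough sketch of that duplication idea, written in plain Python for readability. All names here (`count_pairs`, `_count_pairs`, `_count_pairs_masked`, `ignore`) are hypothetical and this is not the code in `_glcm_loop`. It assumes a pair is skipped when either of its pixels is flagged. The benefit of splitting the loops would only show up in a compiled loop, where the unmasked path avoids the per-pixel test entirely.

```python
import numpy as np


def _count_pairs(image, d_row, d_col, levels):
    """Unmasked loop: no per-pixel mask test."""
    rows, cols = image.shape
    out = np.zeros((levels, levels), dtype=np.uint32)
    for r in range(rows):
        for c in range(cols):
            row, col = r + d_row, c + d_col
            if 0 <= row < rows and 0 <= col < cols:
                out[image[r, c], image[row, col]] += 1
    return out


def _count_pairs_masked(image, d_row, d_col, levels, ignore):
    """Duplicate of the loop above, plus the mask test."""
    rows, cols = image.shape
    out = np.zeros((levels, levels), dtype=np.uint32)
    for r in range(rows):
        for c in range(cols):
            row, col = r + d_row, c + d_col
            if 0 <= row < rows and 0 <= col < cols:
                # Assumption: skip the pair if either pixel is flagged.
                if ignore[r, c] or ignore[row, col]:
                    continue
                out[image[r, c], image[row, col]] += 1
    return out


def count_pairs(image, d_row, d_col, levels, ignore=None):
    """Choose the loop once, outside the hot path."""
    if ignore is None:
        return _count_pairs(image, d_row, d_col, levels)
    return _count_pairs_masked(image, d_row, d_col, levels, ignore)


image = np.array([[0, 0, 1], [1, 2, 2]], dtype=np.uint8)
nothing_ignored = np.zeros(image.shape, dtype=bool)
print(count_pairs(image, 0, 1, levels=3))
print(count_pairs(image, 0, 1, levels=3, ignore=nothing_ignored))
```

With an all-`False` `ignore` array, both paths should return the same matrix, which gives a simple consistency check between the two loops.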

lagru avatar Oct 29 '24 18:10 lagru

Hmm, when I introduce a mask is not None check in the inner loop, like so

for r in range(start_row, end_row):
    for c in range(start_col, end_col):
        ...
        if mask is not None and mask[r, c] and mask[row, col]:
            continue
        ...

I get a slowdown from 10.3 ms (original) to 12.7 ms (mask check) for a 1000x1000 image, e.g.

import numpy as np
import skimage as ski
rng = np.random.default_rng(202410291936)
image = rng.integers(0, 255, size=(1000, 1000)).astype(np.uint8)
# Run in ipython
%timeit ski.feature.graycomatrix(image, [1], [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4])

Doesn't seem too bad and might be worth the added support for a masking feature.

lagru avatar Oct 29 '24 18:10 lagru

I think it would be best to leave _glcm_loop() untouched and just duplicate it as you've suggested.

Using today's commit at 16:18 UTC (see code below), the observed runtimes are as follows:

  • for k=0 (no ignored pixels): 6.78 ms
  • for k=256 (all pixels ignored): 1.55 ms
  • when mask=None: 4.06 ms

The average runtime is approximately 4.22 ms in the current live version of the module.

import numpy as np
import skimage as ski
rng = np.random.default_rng(202410301706)
image = rng.integers(0, 255, size=(1000, 1000)).astype(np.uint8)
k = 0
%timeit -n 100 ski.feature.graycomatrix(image, [1], [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4], mask=image<k)

faisaljayousi avatar Oct 30 '24 16:10 faisaljayousi

Hey @lagru. It seems that the same test has failed for the second time. Could this be an issue on my end?

faisaljayousi avatar Nov 28 '24 16:11 faisaljayousi

I think it's unlikely that this is caused by you. I think I've seen that failure before, but it's kinda flaky, so nobody has really taken the time to debug it yet. Feel free to ignore this one. If you are on macOS, you are welcome to debug it, but that's of course not required! :blush:

lagru avatar Nov 28 '24 22:11 lagru

By the way, I fully intend to get back to this, but other things take priority right now. Sorry! :pray:

lagru avatar Nov 28 '24 22:11 lagru