Circular thresholding (e.g. for hue or orientation features)

Open pohlt opened this issue 2 years ago • 22 comments

Description:

Every once in a while, I have the requirement for a thresholding algo which works on a circular feature (such as hue in some color spaces or orientation angle).

Here's a list of algos which do exactly that: https://en.wikipedia.org/wiki/Circular_thresholding (in particular Lai2014)

If you are interested, I could give it a shot.

pohlt avatar Oct 13 '23 10:10 pohlt

The first author even provides source code (no license given). I could ask him if he's ok with including his algo as a Python rewrite.

pohlt avatar Oct 13 '23 12:10 pohlt

Hey @pohlt, welcome! :) Thanks for the suggestion and especially the context around it.

At first glance this seems like a good fit. From our review guide:

In general, we are looking to include algorithms and methods which are established, well documented in the literature and widely used by the imaging community. While this is not a hard requirement, new contributions should be consistent with our mission.

I don't have personal experience with circular thresholding algorithms. But Lai2014's algorithm, which you mention, seems to be the most likely candidate to include. So from my side, feel welcome to go ahead and reach out to the author.

lagru avatar Oct 13 '23 14:10 lagru

I just sent a request to the author.

Do we need some sort of license statement that we can actually use the algorithm in scikit?

pohlt avatar Oct 13 '23 14:10 pohlt

@stefanv would be the expert on licensing issues. But I think the ideal would be for the author to comment here (or somewhere else in public) that he makes the code available to scikit-image under the BSD-3-Clause license.

lagru avatar Oct 13 '23 15:10 lagru

Yukun, the original author, is happy to make his code available. I asked him to post here.

pohlt avatar Oct 14 '23 16:10 pohlt

As an author of the Efficient Circular Thresholding paper, I can confirm that we are happy for the algorithm to be made open source under the BSD-3 clause and included in scikit-image.

yukun-lai avatar Oct 16 '23 19:10 yukun-lai

Thank you @pohlt for reaching out and thank you @yukun-lai for the kind licensing of your code!

I kind of feel bad for thinking about this only now but since the profile, @yukun-lai, is very new I don't know how to verify that you are the actual code author. 🙈

I'm currently trying to figure out if that's actually necessary and / or how we could do that so the effort on your side is minimal.

lagru avatar Oct 17 '23 13:10 lagru

Dear Prof. Yu-Kun Lai, thank you for agreeing to publish your code under the Modified BSD license. Would you be able to change the copy on your website to have the license included? We can provide the license text, if that would be helpful. This is the easiest way for us to verify the licensing intent.

Alternatively, an email to stefanv at berkeley.edu with the statement you made above, sent from your university email address, will suffice.

stefanv avatar Oct 17 '23 16:10 stefanv

I can see your concern. I used to have an old account but can't seem to access it. Just sent the email using my University email address.

yukun-lai avatar Oct 17 '23 16:10 yukun-lai

Hi @stefanv, just let me know when you got the mail.

I'm currently busy, so there's no rush with the license topic.

pohlt avatar Oct 23 '23 08:10 pohlt

From: Yukun Lai <LaiY4@cardiff...>
To: "stefanv@berkeley" <stefanv@berkeley...>
Subject: circular thresholding code
Date: Wed, 18 Oct 2023 09:04:11 +0000
Message-ID: LO0P265MB6053BA9C84B2E292204B9479B2D5A@LO0P265MB6053.GBRP265.PROD.OUTLOOK.COM

Dear Stefan,

As an author of the Efficient Circular Thresholding paper, I can confirm that we are happy for the algorithm to be made open source under the BSD-3 clause and included in scikit-image.

Best regards, Yukun

stefanv avatar Oct 23 '23 16:10 stefanv

Thank you, @yukun-lai 🙏

stefanv avatar Oct 23 '23 16:10 stefanv

A first implementation is ready: https://github.com/pohlt/scikit-image/blob/co/__co/co.ipynb

There is a naive implementation doing a lot of redundant calculations. Another version ("less") avoids some calculations of sigma. Yet another version ("updater") does something like a rolling update of the values similar to Yukun's code. The performance on my laptop is about 1-3 ms for all three versions. The updater version is typically the fastest. If you run the notebook locally, you can even play around with a few sliders (based on ipywidgets).
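To make the idea concrete, here is a minimal brute-force sketch of two-class circular Otsu thresholding on a histogram. It is not the notebook's code and includes none of the "less" or "updater" optimisations. The function name `circular_otsu_bruteforce`, the return convention, and the choice to minimise the summed within-class variance over all arcs are assumptions made for illustration.

```python
import numpy as np


def circular_otsu_bruteforce(hist):
    """Return (start, length) of the arc that minimises within-class scatter.

    Hypothetical helper: class A is the `length` consecutive bins beginning
    at `start` (wrapping around the end), class B is all remaining bins.
    """
    hist = np.asarray(hist, dtype=float)
    n = hist.size
    pos = np.arange(2 * n)               # unwrapped bin positions
    h2 = np.concatenate([hist, hist])    # histogram repeated so arcs can wrap
    best_cost, best_start, best_length = np.inf, 0, 1
    for start in range(n):
        for length in range(1, n):
            cost = 0.0
            # First slice is class A, second slice is its complement on the circle.
            for lo, hi in ((start, start + length), (start + length, start + n)):
                w = h2[lo:hi]
                total = w.sum()
                if total > 0:
                    mean = (w * pos[lo:hi]).sum() / total
                    # Weighted sum of squared deviations within this class.
                    cost += (w * (pos[lo:hi] - mean) ** 2).sum()
            if cost < best_cost:
                best_cost, best_start, best_length = cost, start, length
    return best_start, best_length
```

This loop is cubic in the number of bins, so it is only useful as a reference to check faster variants against on small histograms.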

Any feedback is highly appreciated!

pohlt avatar Nov 03 '23 12:11 pohlt

Could you please give me some guidance concerning the interface?

Similar to the signature of the linear otsu call I would like to preserve the choice to either provide an image or an already calculated histogram. Both options have their challenges:

  • image: We either need to know the value range (minimum and maximum) to correctly calculate the histogram (e.g. [0, 2*pi)) or we assume that the input is already normalized such that the range is [0, 1).
  • histogram: Linear otsu accepts histograms with bins of varying size which makes it difficult to correctly "wrap" the histogram. How should we handle this in the context of circular thresholding?

pohlt avatar Nov 27 '23 15:11 pohlt

Thanks so much for working on this. I have a few questions:

  • These functions currently take in a histogram (x, h), correct?
  • I am noticing that the plots only display "naive circular otsu" in their legend? From your code it seems like you want to display the "less" and "updater" version also.

Similar to the signature of the linear otsu call I would like to preserve the choice to either provide an image or an already calculated histogram. Both options have their challenges: [...]

I am not so sure if this is good API design. But to answer your question, it seems that for the linear case that problem is currently solved by https://github.com/scikit-image/scikit-image/blob/8f864325214ac7a9876b07d80e89815a7ab4d179/skimage/filters/thresholding.py#L387

Would that work for your case as well? For the image case, I'd just document how the minimum and maximum are determined. And if users want something different, they're supposed to pass in the histogram themselves. Concerning the "bins of varying size", for now I'd put in a check that raises an error if bins are not of the same size. Once we get to reviewing (and in my case understanding the actual algorithm) we may see an opportunity to address this.
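A check for equally sized bins could look roughly like the sketch below. It assumes the histogram arrives together with an array of bin centers; the helper name `check_uniform_bins` and its tolerance are hypothetical and not part of scikit-image.

```python
import numpy as np


def check_uniform_bins(bin_centers, rtol=1e-6):
    """Return the common bin spacing, or raise ValueError if it varies.

    Hypothetical helper for illustration only.
    """
    centers = np.asarray(bin_centers, dtype=float)
    if centers.ndim != 1 or centers.size < 2:
        raise ValueError("need a 1-D array with at least two bin centers")
    steps = np.diff(centers)
    if steps[0] <= 0:
        raise ValueError("bin centers must be strictly increasing")
    # Compare every spacing against the first one, relative tolerance only.
    if not np.allclose(steps, steps[0], rtol=rtol, atol=0.0):
        raise ValueError("circular thresholding requires bins of equal width")
    return float(steps[0])
```

Raising early keeps the wrap-around logic simple, because the distance between the last and the first bin is then the same as between any other pair of neighbours.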

If you feel ready I'd encourage you to make a draft PR. :D

lagru avatar Nov 28 '23 14:11 lagru

The notebook was just a proof of concept, not the actual API I have in mind. Sorry for the confusion.

  • histogram: I like the check for constant bin sizes, so let's do it like this.
  • image: If we just take the minimum and maximum value found in the image for the total range, the results will be wrong most of the time. Forcing the user to normalize the range to [0, 1) increases the required computations. I'd propose additional parameter(s) (min_val/max_val, or range as a tuple) which the user has to provide if she wants the histogram to be calculated.

What do you think?

pohlt avatar Nov 28 '23 16:11 pohlt

Adding additional parameters if the histogram is to be calculated is something that we could totally do. Could you explain why the algorithm needs to know these min and max values to be correct in most cases? Is that to correctly judge the distance between the two bins that wrap around from the end to the start of the histogram / value range?

lagru avatar Nov 28 '23 19:11 lagru

Sure, let's assume you calculated the hue values of an image which are often in the range [0, 2pi) and you want to use those hue values to do circular thresholding. The image most likely does not contain exactly 0 or 2pi, so taking the min/max values of the image to calculate the histogram would be (slightly) off.

Therefore, to calculate a correct circular thresholding value you have to specify the exact range.
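The effect can be seen with a small synthetic example. The sketch below uses made-up hue samples and `numpy.histogram`; the variable names are illustrative only. With the data-driven range, the outermost bin edges stop at the smallest and largest sample, so the histogram no longer covers the full circle.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "hue" samples in radians that never reach 0 or 2*pi exactly.
hue = rng.uniform(0.3, 6.0, size=10_000)

# Range inferred from the data: edges run from hue.min() to hue.max().
counts_data, edges_data = np.histogram(hue, bins=256)

# Explicit circular range: edges run from 0 to 2*pi, one full period.
counts_full, edges_full = np.histogram(hue, bins=256, range=(0.0, 2 * np.pi))

print("data-driven edges:", edges_data[0], edges_data[-1])
print("explicit edges:   ", edges_full[0], edges_full[-1])
```

Only in the second histogram are the first and last bins true neighbours on the circle, which is what a wrap-around threshold search relies on.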

Does that make sense?

pohlt avatar Nov 28 '23 20:11 pohlt

Dear all,

Sorry for my late reply.

For the histogram input, I think it makes sense to only handle cases with uniform bins? For the image input, it is possible to take as input the min/max values (which are needed, unless we assume the range of values is normalised). For image input, I suppose we also need to know the bin size or the number of bins, especially if the input image has pixel values as real numbers?

In terms of different implementations, they should give identical results. The updater version is significantly faster when there are more bins (16-bit case rather than 256 bins).

Are we also considering the case with multiple thresholds (more than 2)?

Best regards, Yukun

yukun-lai avatar Jan 11 '24 16:01 yukun-lai

  • Histogram input: I'm fine with limiting the API to uniform bins
  • Image input: As described above, we need explicit min/max values. Number of bins is certainly desirable in any case, but we could prescribe a default value of 256.
  • Let's get started with a single threshold. A multi-threshold use case would be a different call that takes the same input as the single-threshold case plus the number of thresholds, so there is no need to consider it for the simple single-threshold case.
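The input handling summarised in these points could be sketched as follows. Everything here is hypothetical: the helper name `circular_histogram` and the parameter names `value_range`, `nbins`, and `hist` are placeholders for whatever the eventual API uses, and the check for uniform bins on a user-supplied histogram is left out.

```python
import numpy as np


def circular_histogram(image=None, *, value_range=None, nbins=256, hist=None):
    """Return bin counts spanning one full period (hypothetical helper).

    Pass either a precomputed `hist` (assumed to have uniform bins) or an
    `image` together with the explicit `value_range` as (low, high).
    """
    if (image is None) == (hist is None):
        raise ValueError("provide exactly one of `image` or `hist`")
    if hist is not None:
        return np.asarray(hist, dtype=float)
    if value_range is None:
        raise ValueError("`value_range` is required when passing an image")
    low, high = value_range
    # The explicit range makes the bins cover the whole circle,
    # regardless of which values actually occur in the image.
    counts, _ = np.histogram(np.asarray(image), bins=nbins, range=(low, high))
    return counts.astype(float)
```

For example, `circular_histogram(image=hue, value_range=(0, 2 * np.pi))` would use the default of 256 bins, while `circular_histogram(hist=counts)` passes an existing histogram through unchanged.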

Sorry for the slow progress, but other projects leave little wiggle room for fun stuff like this.

pohlt avatar Jan 22 '24 13:01 pohlt

Sorry, guys. Busy times, but I still plan to get this done...

pohlt avatar May 22 '24 07:05 pohlt

Hi,

I think the code is already working correctly. The updater version is fastest (especially when the number of bins is large, e.g. 16-bit images). For two-class cases, we only need min/max values and the number of bins (default can be 0/255, with 256 bins).

Best regards, Yukun

yukun-lai avatar Jun 05 '24 22:06 yukun-lai