Address divide by zero when getelbow called with only one component

Open handwerkerd opened this issue 4 years ago • 0 comments

Summary

This is a continuation of issue #752 and PR #760. That discussion focused on removing a really uninformative error when getelbow was called on an empty array. I now have a situation where getelbow is called on only one component: it runs, but emits a divide-by-zero warning.
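To illustrate why a single component can trigger this, here is a minimal sketch. It assumes getelbow finds the point farthest from the straight line (chord) joining the first and last points of the sorted curve; that is an assumption about the method, not a copy of the tedana code, and chord_elbow is a hypothetical name. With one value, the first and last points coincide, the chord has zero length, and normalising it divides zero by zero.

```python
import warnings

import numpy as np


def chord_elbow(values):
    """Hypothetical stand-in for an elbow finder (not tedana's getelbow).

    Returns the index, in descending sorted order, of the point farthest
    from the chord joining the first and last points of the curve.
    """
    arr = np.sort(np.asarray(values, dtype=float))[::-1]
    n = arr.shape[0]
    coords = np.array([np.arange(n), arr])  # shape (2, n): rank, value
    p = coords - coords[:, [0]]             # shift the first point to the origin
    b = p[:, -1]                            # chord from first to last point
    # With n == 1 the chord is (0, 0), so this line computes 0 / 0.
    b_hat = (b / np.sqrt((b ** 2).sum())).reshape(2, 1)
    perp = p - (b_hat.T @ p) * b_hat        # part of each point perpendicular to the chord
    dist = np.sqrt((perp ** 2).sum(axis=0))
    return int(dist.argmax())


# A single value runs to completion but emits a NumPy RuntimeWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    chord_elbow([3.0])
print([str(w.message) for w in caught])
```

With two or more distinct values the chord has non-zero length and no warning is raised, which matches the observation that a minimum of 2 components avoids the problem.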

The basic solution to this is extremely easy: the warning goes away if getelbow is only run with a minimum of 2 components. I'm creating an issue rather than a PR because there may be some relevant discussion here. One can calculate an elbow of a curve with only two values, but that's essentially meaningless. If we're setting a minimum number of components below which we shouldn't calculate an elbow, what should it be? 3 is mathematically plausible, but still non-ideal. I'm leaning towards "5" while acknowledging that's arbitrary.

Additional Detail

The block of code where I get the divide-by-zero warning is: https://github.com/ME-ICA/tedana/blob/5399ca5368d7fab1a450208d13752c067de8503e/tedana/selection/tedica.py#L261-L269

The warning goes away when the check "if not kappas_nonsig.size" is replaced with "if kappas_nonsig.size <= 1", but the question is whether the condition should be something like "<= 5" instead.
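The size check can be sketched as a small wrapper with a configurable minimum. This is an illustration only: guarded_elbow, MIN_ELBOW_COMPONENTS, and the elbow_func argument are hypothetical names, and the linked tedica.py block is not reproduced here. A minimum of 2 corresponds to the "size <= 1" check, and a minimum of 6 to "size <= 5".

```python
import numpy as np

# Hypothetical constant; the right value (2, 3, 5, ...) is the open question.
MIN_ELBOW_COMPONENTS = 2


def guarded_elbow(values, elbow_func, min_components=MIN_ELBOW_COMPONENTS):
    """Return elbow_func(values), or NaN when there are too few values.

    elbow_func stands in for whatever computes the elbow; it is only
    called when at least min_components values are available.
    """
    values = np.asarray(values, dtype=float)
    if values.size < min_components:
        return np.nan
    return elbow_func(values)


# np.median is used here purely as a placeholder elbow function.
print(guarded_elbow([4.2], np.median))                    # too few values
print(guarded_elbow([4.2, 3.1, 1.0], np.median))          # computed
print(guarded_elbow([4.2, 3.1, 1.0], np.median, min_components=6))
```

Returning NaN rather than raising keeps the calling code simple if the result is later combined with another elbow using a NaN-aware reduction.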

The last PR also added a check and warning in getelbow. We can adjust that too, so that it throws a warning when the number of components is under a threshold rather than only when the array is empty: https://github.com/ME-ICA/tedana/blob/5399ca5368d7fab1a450208d13752c067de8503e/tedana/selection/_utils.py#L97-L105

As far as I can tell, getelbow is called in two situations. (1) On most or all of the components. In that situation, this will never be an issue because, if there are only 5 total components, then tedana will have bigger problems. (2) On a subset of components, where that second elbow threshold is used in conjunction with a threshold on all components. The current approach is to just ignore that second threshold when there are zero components.

Another consideration is that the second threshold on non-significant components can only lower the kappa elbow, which results in accepting more components and more T2* signal. Excluding this option when there are fewer than X components means that denoising may be slightly more aggressive.
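The "can only lower" behaviour can be sketched as follows, assuming the two elbows are combined by taking their minimum while ignoring NaN. That combination rule is an assumption inferred from the description, and combined_kappa_elbow is a hypothetical name.

```python
import numpy as np


def combined_kappa_elbow(elbow_all, elbow_nonsig):
    """Return the lower of the two elbows, ignoring a NaN second elbow.

    elbow_all: elbow computed on all components.
    elbow_nonsig: elbow computed on the non-significant subset, or NaN
        when that subset was too small to calculate one.
    """
    return float(np.nanmin((elbow_all, elbow_nonsig)))


# A lower second elbow wins; a higher or missing one changes nothing.
print(combined_kappa_elbow(30.0, 22.0))
print(combined_kappa_elbow(30.0, 45.0))
print(combined_kappa_elbow(30.0, np.nan))
```

Under this assumption, setting the second elbow to NaN for small subsets never lowers the final threshold, so fewer components would pass it than if the second elbow had been calculated and came out lower.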

Next Steps

  • Decide on the appropriate threshold for setting an elbow to nan rather than using just a few components to calculate an elbow. Options:

    1. If there's only one component, set that value as the elbow
    2. Set the minimum to 2, which means the code will run as is and the elbow won't actually be an elbow, but will be somewhere around those two values
    3. Set a minimum closer to 5 where an elbow calculation has a chance of being meaningful

    I was originally leaning towards option 3, but I might be overthinking this, and option 2 might be simplest.

  • Once a decision is made, implement this.

handwerkerd, Sep 22 '21 21:09