ndarray-stats icon indicating copy to clipboard operation
ndarray-stats copied to clipboard

quantile_mut: fatal runtime error: stack overflow

Open sjackman opened this issue 2 years ago • 5 comments

Description quantile_mut can fail with the error message:

thread 'main' has overflowed its stack
fatal runtime error: stack overflow

Version Information

  • ndarray: 0.15.4
  • ndarray-stats: 0.5.0
  • Rust: 1.58.1

To Reproduce

use ndarray::Array1;
use ndarray_stats::{interpolate::Linear, Quantile1dExt};
use noisy_float::types::{n64, N64};

fn main() {
    {
        let mut array: Array1<N64> = Array1::ones(15300);
        println!("One {}", array.quantile_mut(n64(0.5), &Linear).unwrap());
    }

    {
        let mut array: Array1<N64> = Array1::ones(15600);
        println!("Two {}", array.quantile_mut(n64(0.5), &Linear).unwrap());
    }

    {
        let mut array: Array1<N64> = Array1::ones(100000);
        println!("Three {}", array.quantile_mut(n64(0.5), &Linear).unwrap());
    }
}

Observed behavior

$ cargo run --profile=dev
One 1

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
$ cargo run --profile=release
One 1
Two 1

thread 'main' has overflowed its stack
fatal runtime error: stack overflow

Expected behavior

One 1
Two 1
Three 1

Additional context

  • I'm able to reproduce this issue on both Linux and macOS with the default stack limit of 8 MiB. (ulimit -s reports 8192)
  • The result is non-deterministic. Re-running the executable can succeed sometimes and fail sometimes. The larger the vector the more likely it is to fail.
  • The result depends on whether optimization is enabled.

sjackman avatar Mar 04 '22 21:03 sjackman

Exceeding the stack limit suggests to me either infinite recursion or a large data structure is being placed on the heap. I don't see an obvious source of either of those causes in the code. It's possible that more generalized memory corruption could cause this behaviour. https://github.com/rust-ndarray/ndarray-stats/blob/9ed937302682b1883128d920e0157615b830778b/src/quantile/mod.rs#L437-L498

sjackman avatar Mar 04 '22 21:03 sjackman

https://github.com/rust-ndarray/ndarray-stats/blob/b6628c6a1c143532673213c56d46da5fda80cbe8/src/sort.rs#L293 Looks like this recursive function call _get_many_from_sorted_mut_unchecked is recursing probably not infinitely but once per size of the vector, and blowing through the stack, perhaps because all the elements in the array are equal. (thanks to @evolvedmicrobe for discovering)

sjackman avatar Mar 05 '22 01:03 sjackman

I think this issue may be fixed by #80; I'll take another look at that PR this weekend to see if we can get it merged.

jturner314 avatar Mar 05 '22 02:03 jturner314

I checked, and #80 doesn't fix this issue as-is. It's a bit tricky to avoid stack overflow with the recursive algorithm for large arrays with all equal elements. At first glance, I suspect that the best fix would be for partitioning to handle == pivot and > pivot elements separately, instead of grouping them together. Please feel free to contribute a fix. Otherwise, I'll work on it when I get a chance (which may be a while).

jturner314 avatar Mar 07 '22 02:03 jturner314

Thank you for checking, Jim! I have a work around for now using slice:sort_unstable. I'd like to get to fixing this issue, but it similarly may be a while. I'll post my workaround below, in case anyone else stumbles on this issue.

/// Return the median. Sorts its argument in place.
pub fn median_mut<T>(xs: &mut Array1<T>) -> Result<T, QuantileError>
where
    T: Clone + Copy + Ord + FromPrimitive,
    T: Add<Output = T> + Sub<Output = T> + Mul<Output = T> + Div<Output = T> + Rem<Output = T>,
{
    if false {
        // quantile_mut may fail with the error: fatal runtime error: stack overflow
        // See https://github.com/rust-ndarray/ndarray-stats/issues/86
        xs.quantile_mut(n64(0.5), &Midpoint)
    } else {
        if xs.is_empty() {
            return Err(QuantileError::EmptyInput);
        }
        xs.as_slice_mut().unwrap().sort_unstable();
        Ok(if xs.len() % 2 == 0 {
            (xs[xs.len() / 2] + xs[xs.len() / 2 - 1]) / (T::from_u64(2).unwrap())
        } else {
            xs[xs.len() / 2]
        })
    }
}

sjackman avatar Mar 07 '22 17:03 sjackman