nom icon indicating copy to clipboard operation
nom copied to clipboard

Parsing of long floating point numbers causes panic

Open MatejKastak opened this issue 2 years ago • 6 comments

Hello, first of all thank you, nom is a great tool!

I have tried to migrate to a new nom version (7.0.0) and encountered some unexpected behavior.

  • parsing of floating point numbers (float, double) panics for long inputs
  • sadly I found no way to handle this error, program is aborted
  • are there any tips to avoid such panics?
  • recent 6.* versions does not panic
  • migrating to the most recent version is not critical to me, but I wanted to ask/note this behavior for others

After a little digging, I think this is caused by switching to minimal-lexical.

the lexical-core crate used for float parsing has now been replaced with minimal-lexical: the new crate is faster to compile, faster to parse, and has no dependencies

Somewhat related to #1384

Prerequisites

  • Rust version : rustc 1.55.0 (c8dfcfe04 2021-09-06) also tried it with nightly
  • nom version : v7.0.0
  • nom compilation features used: default

Test case

mod tests {
    use nom::number::complete::float;
    use nom::IResult;

    fn parser(s: &str) -> IResult<&str, f32> {
        float(s)
    }

    #[test]
    fn float_panic() {
        // 7.0.0 -> panic
        // 6.2.1 -> pass
        assert!(std::matches!(parser("0.00000000000000000087"), Ok(_)));
        assert!(std::matches!(
            parser("0.00000000000000000000000000000000000000000000000000000000000000087"),
            Ok(_)
        ));

    }
}               

nom v7.0.0

---- tests::float_panic stdout ----
thread 'tests::float_panic' panicked at 'attempt to shift left with overflow', /home/anon/.cargo/registry/src/github.com-1ecc6299db9ec823/minimal-lexical-0.1.4/src/lemire.rs:155:5

nom v6.2.1

running 1 test
test tests::float_panic ... ok

MatejKastak avatar Oct 05 '21 11:10 MatejKastak

thanks for the report. Could you test with nom directly from master?

Geal avatar Oct 07 '21 08:10 Geal

Sure.

~/dev/sandbox/nom-float-panic-new master* 21s
λ cargo test
    Updating crates.io index
   Compiling version_check v0.9.3
   Compiling memchr v2.3.4
   Compiling minimal-lexical v0.2.1
   Compiling nom v7.0.0 (/home/anon/dev/sandbox/nom)
   Compiling nom-float-panic-new v0.1.0 (/home/anon/dev/sandbox/nom-float-panic-new)
    Finished test [unoptimized + debuginfo] target(s) in 9.44s
     Running unittests (target/debug/deps/nom_float_panic_new-b07783cdd65cd87f)

running 1 test
test tests::float_panic ... FAILED

failures:

---- tests::float_panic stdout ----
thread 'tests::float_panic' panicked at 'attempt to shift left with overflow', /home/anon/.cargo/registry/src/github.com-1ecc6299db9ec823/minimal-lexical-0.2.1/src/lemire.rs:155:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I did a git clone, current master HEAD is at dfa55914dc9673e3092974e00f8967220fbfaf78, and then specified nom dependency as path

[dependencies]
# nom = "6"
nom = { path = "../nom" }

MatejKastak avatar Oct 07 '21 21:10 MatejKastak

I got the same panic using b"0.00000000000000000018" as input.

nbigaouette avatar Oct 12 '21 21:10 nbigaouette

Here's the workaround I'm using for now. It's a bit of a cudgel, but afaict it gets the old behavior back.

/// This is derived from nom::number::complete::float, which is unfortunately broken in nom 7.0.0.
/// - https://github.com/Geal/nom/issues/1421
/// - https://github.com/Geal/nom/issues/1384
pub fn float<T, E: ParseError<T>>(input: T) -> IResult<T, f32, E>
where
    T: nom::Slice<std::ops::RangeFrom<usize>>
        + nom::Slice<std::ops::RangeTo<usize>>
        + nom::Slice<std::ops::Range<usize>>,
    T: Clone + nom::Offset,
    T: nom::InputIter + nom::InputLength + nom::InputTake,
    <T as nom::InputIter>::Item: nom::AsChar + Copy,
    <T as nom::InputIter>::IterElem: Clone,
    T: nom::InputTakeAtPosition,
    <T as nom::InputTakeAtPosition>::Item: nom::AsChar,
    T: nom::AsBytes,
    T: for<'a> nom::Compare<&'a [u8]>,
    T: nom::ParseTo<f32>,
    T: nom::InputTake + nom::Compare<&'static str>,
{
    let (i, s) = recognize_float(input)?;
    match s.parse_to() {
        Some(f) => (Ok((i, f))),
        None => Err(nom::Err::Error(E::from_error_kind(
            i,
            nom::error::ErrorKind::Float,
        ))),
    }
}

pub fn recognize_float<T, E: ParseError<T>>(input: T) -> IResult<T, T, E>
where
    T: nom::Slice<std::ops::RangeFrom<usize>> + nom::Slice<std::ops::RangeTo<usize>>,
    T: Clone + nom::Offset,
    T: nom::InputIter,
    <T as nom::InputIter>::Item: nom::AsChar,
    T: nom::InputTakeAtPosition,
    <T as nom::InputTakeAtPosition>::Item: nom::AsChar,
    T: nom::InputTake + nom::Compare<&'static str>,
{
    alt((
        nom::number::complete::recognize_float,
        nom::combinator::recognize(nom::bytes::complete::tag("NaN")),
    ))(input)
}

mullr avatar Oct 22 '21 19:10 mullr

This should have been closed by #1455

parasyte avatar Nov 06 '21 17:11 parasyte

Yes. Leaving it open for now, I'd like to get a proper fix in minimal-lexical, because we take a perf hit with that workaround

Geal avatar Nov 06 '21 17:11 Geal