nom
nom copied to clipboard
Parsing of long floating point numbers causes panic
Hello, first of all thank you, nom is a great tool!
I have tried to migrate to a new nom version (7.0.0) and encountered some unexpected behavior.
- parsing of floating point numbers (
float
,double
) panics for long inputs - sadly I found no way to handle this error, program is aborted
- are there any tips to avoid such panics?
- recent
6.*
versions does not panic - migrating to the most recent version is not critical to me, but I wanted to ask/note this behavior for others
After a little digging, I think this is caused by switching to minimal-lexical
.
the lexical-core crate used for float parsing has now been replaced with minimal-lexical: the new crate is faster to compile, faster to parse, and has no dependencies
Somewhat related to #1384
Prerequisites
- Rust version :
rustc 1.55.0 (c8dfcfe04 2021-09-06)
also tried it with nightly - nom version :
v7.0.0
- nom compilation features used: default
Test case
mod tests {
use nom::number::complete::float;
use nom::IResult;
fn parser(s: &str) -> IResult<&str, f32> {
float(s)
}
#[test]
fn float_panic() {
// 7.0.0 -> panic
// 6.2.1 -> pass
assert!(std::matches!(parser("0.00000000000000000087"), Ok(_)));
assert!(std::matches!(
parser("0.00000000000000000000000000000000000000000000000000000000000000087"),
Ok(_)
));
}
}
nom v7.0.0
---- tests::float_panic stdout ----
thread 'tests::float_panic' panicked at 'attempt to shift left with overflow', /home/anon/.cargo/registry/src/github.com-1ecc6299db9ec823/minimal-lexical-0.1.4/src/lemire.rs:155:5
nom v6.2.1
running 1 test
test tests::float_panic ... ok
thanks for the report. Could you test with nom directly from master?
Sure.
~/dev/sandbox/nom-float-panic-new master* 21s
λ cargo test
Updating crates.io index
Compiling version_check v0.9.3
Compiling memchr v2.3.4
Compiling minimal-lexical v0.2.1
Compiling nom v7.0.0 (/home/anon/dev/sandbox/nom)
Compiling nom-float-panic-new v0.1.0 (/home/anon/dev/sandbox/nom-float-panic-new)
Finished test [unoptimized + debuginfo] target(s) in 9.44s
Running unittests (target/debug/deps/nom_float_panic_new-b07783cdd65cd87f)
running 1 test
test tests::float_panic ... FAILED
failures:
---- tests::float_panic stdout ----
thread 'tests::float_panic' panicked at 'attempt to shift left with overflow', /home/anon/.cargo/registry/src/github.com-1ecc6299db9ec823/minimal-lexical-0.2.1/src/lemire.rs:155:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
I did a git clone, current master HEAD is at dfa55914dc9673e3092974e00f8967220fbfaf78, and then specified nom dependency as path
[dependencies]
# nom = "6"
nom = { path = "../nom" }
I got the same panic using b"0.00000000000000000018"
as input.
Here's the workaround I'm using for now. It's a bit of a cudgel, but afaict it gets the old behavior back.
/// This is derived from nom::number::complete::float, which is unfortunately broken in nom 7.0.0.
/// - https://github.com/Geal/nom/issues/1421
/// - https://github.com/Geal/nom/issues/1384
pub fn float<T, E: ParseError<T>>(input: T) -> IResult<T, f32, E>
where
T: nom::Slice<std::ops::RangeFrom<usize>>
+ nom::Slice<std::ops::RangeTo<usize>>
+ nom::Slice<std::ops::Range<usize>>,
T: Clone + nom::Offset,
T: nom::InputIter + nom::InputLength + nom::InputTake,
<T as nom::InputIter>::Item: nom::AsChar + Copy,
<T as nom::InputIter>::IterElem: Clone,
T: nom::InputTakeAtPosition,
<T as nom::InputTakeAtPosition>::Item: nom::AsChar,
T: nom::AsBytes,
T: for<'a> nom::Compare<&'a [u8]>,
T: nom::ParseTo<f32>,
T: nom::InputTake + nom::Compare<&'static str>,
{
let (i, s) = recognize_float(input)?;
match s.parse_to() {
Some(f) => (Ok((i, f))),
None => Err(nom::Err::Error(E::from_error_kind(
i,
nom::error::ErrorKind::Float,
))),
}
}
pub fn recognize_float<T, E: ParseError<T>>(input: T) -> IResult<T, T, E>
where
T: nom::Slice<std::ops::RangeFrom<usize>> + nom::Slice<std::ops::RangeTo<usize>>,
T: Clone + nom::Offset,
T: nom::InputIter,
<T as nom::InputIter>::Item: nom::AsChar,
T: nom::InputTakeAtPosition,
<T as nom::InputTakeAtPosition>::Item: nom::AsChar,
T: nom::InputTake + nom::Compare<&'static str>,
{
alt((
nom::number::complete::recognize_float,
nom::combinator::recognize(nom::bytes::complete::tag("NaN")),
))(input)
}
This should have been closed by #1455
Yes. Leaving it open for now, I'd like to get a proper fix in minimal-lexical, because we take a perf hit with that workaround