fix: Fix json range query with different number type

Open b41sh opened this issue 4 months ago • 6 comments

This PR fixes a panic caused by JSON range queries whose bounds have different numerical types:

thread 'query::range_query::range_query_fastfield::tests::json_range_test' (805162) panicked at src/query/range_query/range_query_fastfield.rs:261:64:
called `Option::unwrap()` on a `None` value

b41sh avatar Oct 24 '25 09:10 b41sh

I'd rather have the limitation that both bounds provided by the user have to be of the same type and otherwise return an error

Not sure it makes sense to have a range from i64 to f64
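A minimal sketch of the kind of check suggested here, assuming the parsed bounds are available as a small enum. `NumBound` and `check_same_type` are hypothetical names for illustration, not tantivy's actual types or API:

```rust
/// Hypothetical stand-in for a parsed numeric bound (not tantivy's real type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NumBound {
    I64(i64),
    U64(u64),
    F64(f64),
}

/// Returns an error when the two bounds were parsed as different numeric types.
pub fn check_same_type(lower: NumBound, upper: NumBound) -> Result<(), String> {
    use std::mem::discriminant;
    // `discriminant` compares only the enum variant, not the contained value.
    if discriminant(&lower) == discriminant(&upper) {
        Ok(())
    } else {
        Err(format!(
            "range bounds must have the same numeric type, got {:?} and {:?}",
            lower, upper
        ))
    }
}
```

With this check, a range such as `NumBound::I64(-1)` to `NumBound::F64(2.5)` is rejected with an error instead of reaching an `unwrap()` later on.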

PSeitz avatar Oct 24 '25 09:10 PSeitz

@PSeitz Thanks for your review. Bounds may have different types in many scenarios, such as [-1 to 18446744073709551615], where the types are i64 and u64 respectively. We cannot convert them to a single type unless using i128.
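The integer-width argument can be checked with the standard library alone. The helper names below are made up for this sketch:

```rust
use std::convert::TryFrom;

/// True if a signed value can be represented as u64 without loss.
pub fn fits_in_u64(v: i64) -> bool {
    u64::try_from(v).is_ok()
}

/// True if an unsigned value can be represented as i64 without loss.
pub fn fits_in_i64(v: u64) -> bool {
    i64::try_from(v).is_ok()
}

/// Widens both bounds to i128, which covers the full i64 and u64 ranges.
pub fn widen(lower: i64, upper: u64) -> (i128, i128) {
    (i128::from(lower), i128::from(upper))
}
```

For the range in question, `fits_in_u64(-1)` and `fits_in_i64(18446744073709551615)` are both false, while `widen(-1, 18446744073709551615)` yields two exact, ordered i128 values.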

b41sh avatar Oct 24 '25 10:10 b41sh

In this case you would need to use f64 (same as the data would be coerced to on the column)
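A rough sketch of what coercing both bounds to f64 could look like. `NumBound` and `coerce_to_f64` are hypothetical names used only for illustration, not part of tantivy:

```rust
/// Hypothetical stand-in for a parsed numeric bound (not tantivy's real type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NumBound {
    I64(i64),
    U64(u64),
    F64(f64),
}

/// Coerces any numeric bound to f64 so that both bounds share one type.
/// Integers with magnitude above 2^53 may be rounded by this conversion.
pub fn coerce_to_f64(bound: NumBound) -> f64 {
    match bound {
        NumBound::I64(v) => v as f64,
        NumBound::U64(v) => v as f64,
        NumBound::F64(v) => v,
    }
}
```

Applied to the range [-1 TO 18446744073709551615], the lower bound becomes -1.0 and the upper bound rounds to 2^64, so the two are comparable as f64 at the cost of exactness for very large integers.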

PSeitz avatar Oct 24 '25 13:10 PSeitz

Hi @PSeitz PTAL, I re-implemented this part of the code, converting lower_bound and upper_bound to actual_column_type respectively
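One way to picture a per-bound conversion to the column's type is sketched below. This is not the code from the PR: `NumBound`, `ColumnKind`, and `to_column_kind` are hypothetical, and the sketch simply refuses any conversion that is not exact rather than rounding the bound.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a parsed numeric bound (not tantivy's real type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NumBound {
    I64(i64),
    U64(u64),
    F64(f64),
}

/// Hypothetical stand-in for the numeric type of the fast-field column.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ColumnKind {
    I64,
    U64,
    F64,
}

/// Converts one bound to the column's type; returns None when the value
/// cannot be represented exactly in an integer column.
pub fn to_column_kind(bound: NumBound, kind: ColumnKind) -> Option<NumBound> {
    match (bound, kind) {
        // Already the right type: nothing to do.
        (NumBound::I64(_), ColumnKind::I64)
        | (NumBound::U64(_), ColumnKind::U64)
        | (NumBound::F64(_), ColumnKind::F64) => Some(bound),
        // Integer to integer: only if the value is in range.
        (NumBound::I64(v), ColumnKind::U64) => u64::try_from(v).ok().map(NumBound::U64),
        (NumBound::U64(v), ColumnKind::I64) => i64::try_from(v).ok().map(NumBound::I64),
        // Integer to float: always possible, may round above 2^53.
        (NumBound::I64(v), ColumnKind::F64) => Some(NumBound::F64(v as f64)),
        (NumBound::U64(v), ColumnKind::F64) => Some(NumBound::F64(v as f64)),
        // Float to integer: only integral values inside the target range.
        (NumBound::F64(v), ColumnKind::I64) => {
            let in_range = v >= -9223372036854775808.0 && v < 9223372036854775808.0;
            if v.fract() == 0.0 && in_range { Some(NumBound::I64(v as i64)) } else { None }
        }
        (NumBound::F64(v), ColumnKind::U64) => {
            let in_range = v >= 0.0 && v < 18446744073709551616.0;
            if v.fract() == 0.0 && in_range { Some(NumBound::U64(v as u64)) } else { None }
        }
    }
}
```

A caller would then handle `None` explicitly, for example by returning a query error, instead of calling `unwrap()` on the result.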

b41sh avatar Oct 30 '25 06:10 b41sh

I'd rather have the limitation that both bounds provided by the user have to be of the same type and otherwise return an error. This is a limitation of the API design currently, which should be fixed eventually.

PSeitz avatar Oct 30 '25 08:10 PSeitz

i64, u64, and f64 are all number types, but query parsing resolves them into different types (see https://github.com/quickwit-oss/tantivy/blob/main/src/core/json_utils.rs#L296). For example, for the query json.number:[10.0 TO 10.51], parsing yields a u64 lower_bound and an f64 upper_bound, even though the user entered both bounds as f64-style numbers. This can lead to unexpected panics, so I think this is a bug that needs fixing.

b41sh avatar Oct 30 '25 09:10 b41sh