Adhish Singla
The size of `ObjectMeta` that we set from `content_length` seems to be wrong. From the above error log: "slice index starts at 18446744073707112300 but ends at 14779976438", ...
I get the same error locally. Also content_length is 14779976446. Still trying to fully reproduce it, it takes around 20-25 mins to run once.
The file is approx. 15 GB, so the reported size adds up. Now looking at how that underflow is generated and where it is used.
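For reference, the start index in the panic message sits just below 2^64, which is what an unsigned subtraction that wrapped around would look like. A minimal sketch to check how far below, assuming 64-bit `usize`; the helper names `wrapped_deficit` and `checked_start` are made up for illustration and are not from `object_store` or datafusion:

```rust
/// Hypothetical helper: how far `observed` sits below 2^64, i.e. the
/// amount by which an unsigned subtraction would have gone negative.
fn wrapped_deficit(observed: u64) -> u64 {
    // 0 - observed (mod 2^64) == 2^64 - observed
    0u64.wrapping_sub(observed)
}

/// Hypothetical helper: compute a start offset without wrapping.
/// Returns None instead of a huge bogus index when `len > end`.
fn checked_start(end: u64, len: u64) -> Option<u64> {
    end.checked_sub(len)
}
```

`wrapped_deficit(18446744073707112300)` gives 2439316, so the start index is about 2.4 MB "below zero". Which two values were actually subtracted is not shown in the log, so that part is still a guess.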
~~bug is in datafusion, where it tries to infer the schema and panics when it fails to slice it with that underflow.~~
The bug is here: https://github.com/apache/arrow-rs/blob/f0455d12ddcb174f1f8d2bbfd5874f7b708c9a74/object_store/src/lib.rs#L780C5-L809. Instead of returning the requested range, it returns the whole object, so we will need to push a fix upstream for this one.
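A rough sketch of the kind of guard I have in mind, not the actual upstream code or fix. It assumes the caller can see the HTTP status: 206 means the server honoured the `Range` header, 200 means it ignored it and sent the whole object. `extract_range` is a hypothetical name:

```rust
use std::ops::Range;

/// Hypothetical guard: return exactly the requested bytes, whether or not
/// the server honoured the range request.
fn extract_range(status: u16, body: &[u8], range: Range<usize>) -> Result<&[u8], String> {
    let (start, end) = (range.start, range.end);
    // Reject inverted ranges up front instead of letting the length wrap.
    let expected = end
        .checked_sub(start)
        .ok_or_else(|| format!("invalid range {start}..{end}"))?;
    match status {
        // Partial Content: the body should already be the requested slice.
        206 if body.len() == expected => Ok(body),
        206 => Err(format!("expected {expected} bytes, got {}", body.len())),
        // OK: the range was ignored, so slice the full body ourselves.
        200 => body
            .get(start..end)
            .ok_or_else(|| format!("range {start}..{end} exceeds body of {} bytes", body.len())),
        other => Err(format!("unexpected status {other}")),
    }
}
```

Slicing a full 15 GB response in memory is obviously not ideal, so returning an error on 200 may be the better choice in practice. The point is only that the two cases should be told apart rather than treated as the same bytes.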
> @adhish20 @scsmithr FYI, there were some changes upstream in object_store that makes this specific query flaky.
>
> `https://datasets.clickhouse.com/hits_compatible/hits.parquet` sometimes fulfills a `range` request, other times it does not....
@universalmind303 Should we mark this as closed then?