Scott Donnelly

Results 20 issues of Scott Donnelly

* feat: PBC now expects the period as an argument when querying using PBC functions, rather than storing it in the tree. This prevents the memory layout of the KdTree...
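As a rough illustration of passing the period at query time instead of storing it in the tree, here is a minimal sketch. It is not the library's API: `pbc_dist_sq` and `nearest_pbc` are hypothetical names, and a brute-force scan over a slice stands in for the real KdTree traversal.

```rust
/// Squared Euclidean distance under periodic boundary conditions.
/// The period is supplied by the caller for each axis (hypothetical helper).
fn pbc_dist_sq<const K: usize>(a: &[f64; K], b: &[f64; K], period: &[f64; K]) -> f64 {
    let mut sum = 0.0;
    for i in 0..K {
        // Raw separation along this axis, folded into [0, period).
        let d = (a[i] - b[i]).abs() % period[i];
        // The wrapped-around route may be shorter than the direct one.
        let d = d.min(period[i] - d);
        sum += d * d;
    }
    sum
}

/// Brute-force nearest neighbour that takes the period as an argument.
/// Returns the index of the closest point and its squared distance.
fn nearest_pbc<const K: usize>(
    points: &[[f64; K]],
    query: &[f64; K],
    period: &[f64; K],
) -> Option<(usize, f64)> {
    points
        .iter()
        .enumerate()
        .map(|(i, p)| (i, pbc_dist_sq(p, query, period)))
        .min_by(|x, y| x.1.total_cmp(&y.1))
}
```

Because the period lives only in the query arguments, the stored point data is identical to the non-periodic case, which is the property the change description is after.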

This is a bit of an experiment to see how things could look if we tried to:
- process Manifest Lists and Manifest Files concurrently rather than sequentially
- process...

`InclusiveMetricsEvaluator` is used inside table scans to filter `DataFile` entries within a Manifest, rejecting any of them if their metrics indicate that they cannot contain any rows that match the...
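The idea of rejecting files by their metrics can be sketched as below. This is a simplified illustration and not the iceberg-rust implementation: `Bounds`, `Predicate`, `might_match` and `prune` are hypothetical names, and only integer lower/upper bounds for a single column are modelled.

```rust
/// Per-file lower and upper bound for one column (hypothetical type).
#[derive(Debug, Clone, Copy)]
struct Bounds {
    lower: i64,
    upper: i64,
}

/// A tiny predicate language over that column (hypothetical type).
#[derive(Debug, Clone, Copy)]
enum Predicate {
    LessThan(i64),
    GreaterThan(i64),
    Equal(i64),
}

/// Inclusive evaluation: returns false only when the bounds prove that
/// no row in the file can satisfy the predicate.
fn might_match(bounds: Bounds, pred: Predicate) -> bool {
    match pred {
        Predicate::LessThan(v) => bounds.lower < v,
        Predicate::GreaterThan(v) => bounds.upper > v,
        Predicate::Equal(v) => bounds.lower <= v && v <= bounds.upper,
    }
}

/// Keep only the files that might contain matching rows.
fn prune<'a>(files: &[(&'a str, Bounds)], pred: Predicate) -> Vec<&'a str> {
    files
        .iter()
        .filter(|(_, b)| might_match(*b, pred))
        .map(|(name, _)| *name)
        .collect()
}
```

A file that passes this check may still contain no matching rows; the evaluation only guarantees that rejected files are safe to skip.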

A few of the source files within iceberg-rust are getting very large (especially `manifest.rs`, `manifest_list.rs`, `schema.rs`, and `table_metadata.rs`). Maybe it's just my taste but I find it painful to navigate...


This PR adds some performance testing capabilities. It includes the following features:
* docker-compose environment that includes containers for Minio, Spark, HAProxy and the Iceberg REST Catalog
* Uses HAProxy...

This builds on top of the [concurrent scans PR](https://github.com/apache/iceberg-rust/pull/373) and so needs to be merged after that. It caches parsed instances of `Manifest` and `ManifestList` objects so that they...
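One way such a cache of parsed objects could look is sketched here. It is an assumption-laden illustration, not the PR's code: `ParsedCache` and `get_or_load` are hypothetical names, the key is a plain path string, and a `Mutex<HashMap>` stands in for whatever cache the PR actually uses.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Cache of parsed objects keyed by file path (hypothetical type).
struct ParsedCache<T> {
    entries: Mutex<HashMap<String, Arc<T>>>,
}

impl<T> ParsedCache<T> {
    fn new() -> Self {
        Self {
            entries: Mutex::new(HashMap::new()),
        }
    }

    /// Return the cached value for `path`, or run `load` to parse it once.
    /// Errors from `load` are returned to the caller and nothing is cached.
    fn get_or_load<E>(
        &self,
        path: &str,
        load: impl FnOnce() -> Result<T, E>,
    ) -> Result<Arc<T>, E> {
        // Look up under the lock, then release it before the slow load.
        let cached = self.entries.lock().unwrap().get(path).cloned();
        if let Some(hit) = cached {
            return Ok(hit);
        }
        let parsed = Arc::new(load()?);
        let mut entries = self.entries.lock().unwrap();
        // If another thread inserted in the meantime, keep its entry.
        Ok(Arc::clone(entries.entry(path.to_string()).or_insert(parsed)))
    }
}
```

Handing out `Arc<T>` lets several concurrent scan tasks share one parsed instance without copying it. Note that in this sketch two threads racing on the same missing key may both run `load`; only one result is kept.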

Ensure that errors get propagated back to the caller when encountered during generation of the file plan.

This brings some big performance gains vs the previous sequential batch processing. On my 12-core Ryzen 9 5900X, I see all 12 cores hitting about 50% utilization. Performance on retrieval...

If attempting a table scan where one of the columns in the table is a `Timestamptz` or a `Timestamp`, you will encounter an error like this: ``` called `Result::unwrap()` on...

Fixes: https://github.com/apache/iceberg-rust/issues/532. Timezone needs to be explicitly set to `UTC` to match values written by Iceberg to underlying Parquet files.