Paola Pardo

73 comments by Paola Pardo

Merged on https://github.com/Qbeast-io/qbeast-spark/pull/284

I have a question regarding this task: if we filter all the files with Delta, does it still make sense to filter again with Qbeast by min/max? For the...

And can you @alexeiakimov take care of this task? Thank you!

After discussion, agreed on:
- When applying WHERE file filtering, let min/max Delta Skipping filter the set of files.
- When applying SAMPLING, join both sets of files. This allows...

Wuuuu, we really need to work on this Revision flow... Opening an issue to redefine the steps.

I cannot reproduce the error that you are experiencing. I've tried:
- Version `1.0.0-SNAPSHOT` working with Spark 3.4.1 and Delta 2.4.0. Fine
- Version `1.0.0-6a780ea1-SNAPSHOT` working with Spark 3.5.0 and...

1. The number of columns used to compute the stats **can be set with a table property from Delta**: `delta.dataSkippingNumIndexedCols`. Since it's a table property, you should create the table...
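As a rough illustration of the table property mentioned above, here is a minimal sketch that only builds the Spark SQL statements as strings. The table name `events`, its schema, and the value `3` are hypothetical; the helper functions are not part of Delta or Qbeast, and `USING delta` is an assumption about the table format.

```python
# Sketch: build Spark SQL statements that set Delta's data-skipping column count.
# The table name, schema and the value 3 are hypothetical, for illustration only.

PROPERTY = "delta.dataSkippingNumIndexedCols"


def create_table_sql(table: str, schema: str, num_indexed_cols: int) -> str:
    """Return a CREATE TABLE statement that sets the property at creation time."""
    return (
        f"CREATE TABLE {table} ({schema}) USING delta "
        f"TBLPROPERTIES ('{PROPERTY}' = '{num_indexed_cols}')"
    )


def alter_table_sql(table: str, num_indexed_cols: int) -> str:
    """Return an ALTER TABLE statement that changes the property on an existing table."""
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        f"('{PROPERTY}' = '{num_indexed_cols}')"
    )


create_stmt = create_table_sql("events", "id INT, ts TIMESTAMP, value DOUBLE", 3)
alter_stmt = alter_table_sql("events", 3)
print(create_stmt)
print(alter_stmt)
```

In a Spark session with Delta configured, each string could be passed to `spark.sql(...)`.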

My initial thoughts on this: 1. IdentityTransformation should NOT be superseded by another IdentityTransformation. (By definition, the space value of Identity A is not considered in Identity B unless value...

In DataSource V2 there's also the possibility to **build your own scan of the table**, with more options than DataSource V1 (which we are currently using). Maybe it's worth to...