Paola Pardo
## Description Solves issue #42. The problem with `saveAsTable()` goes beyond a simple **`override`** method; it requires substantial rework of the design and implementation of `QbeastDataSource` and...
## Description Adds new feature #98. :raised_hands: Compaction of small files is coming! ## Type of change In this PR, we present a new feature: compaction of small files...
To be more compatible with underlying Table Formats and set up an easier conversion to Qbeast, we should be able to process files that do not have any Qbeast Metadata...
## Description Updates the versions of Delta, Hadoop, and Spark, and resolves compatibility issues. This sets up the build for the upcoming changes in #98 and #4. ## Type...
**What went wrong?** When a `sample()` is performed against a Qbeast dataset, the Qbeast SQL extension rewrites the Sample operation into a Filter to: - Push down the filter to the...
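The Sample-to-Filter rewrite described above can be sketched in plain Scala. The mapping below is illustrative only, not Qbeast's actual implementation: it assumes each record carries a random weight spanning the full `Int` range, so a sample fraction translates into a weight threshold that a pushed-down predicate can compare against.

```scala
// Illustrative sketch: mapping a sample fraction to a weight threshold.
// The names and the Int-weight assumption are hypothetical, not Qbeast's API.
object SampleToFilterSketch {
  // Size of the full Int weight range, 2^32 - 1, as a Double for scaling.
  private val WeightRange: Double = Int.MaxValue.toLong - Int.MinValue.toLong

  /** Weight at or below which a record is kept when sampling with `fraction`. */
  def weightThreshold(fraction: Double): Int = {
    require(fraction >= 0.0 && fraction <= 1.0, "fraction must be in [0, 1]")
    (Int.MinValue.toLong + (fraction * WeightRange).toLong).toInt
  }

  /** The predicate a Sample(fraction) node would be rewritten into. */
  def keep(recordWeight: Int, fraction: Double): Boolean =
    recordWeight <= weightThreshold(fraction)
}
```

Rewriting the sample into a filter this way lets the data source skip whole files whose stored weight ranges fall entirely above the threshold, instead of scanning everything.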
## What went wrong? Recently, Delta contributors added support for optimizing tables through SQL in the open-source version. :raised_hands: You can read everything in the related commit: https://github.com/delta-io/delta/commit/e366ccd6179c70dd603c2093a912aacfe719ed00...
GitHub now allows embedding videos in the README and other documentation pages. I think it's a good opportunity to include the Jupyter notebook demo here. :cat:
Currently, the only way of writing in Qbeast format is to load your data and write it again with the Spark DataFrame API. It would be good to have an easier...
Right now we add block information for different metrics such as **cube**, **weight**, and **state** to the Delta commit log. ```scala val tags = Map( cubeTag -> cube, weightMinTag -> minWeight.toString,...
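The truncated snippet above can be fleshed out into a self-contained sketch. The tag names and the `Block` fields below are assumptions made for illustration (the real keys live in the Qbeast codebase), but they show the shape of the string-to-string metadata attached to each file entry in the commit log:

```scala
// Illustrative sketch of the per-block tags written to the Delta commit log.
// Tag names and the Block fields are assumptions, not Qbeast's exact schema.
object BlockTagsSketch {
  val cubeTag = "cube"
  val weightMinTag = "minWeight"
  val weightMaxTag = "maxWeight"
  val stateTag = "state"

  // Minimal stand-in for the block metadata being recorded.
  final case class Block(cube: String, minWeight: Int, maxWeight: Int, state: String)

  /** Build the string-to-string tag map stored alongside a file entry. */
  def tagsFor(block: Block): Map[String, String] = Map(
    cubeTag -> block.cube,
    weightMinTag -> block.minWeight.toString,
    weightMaxTag -> block.maxWeight.toString,
    stateTag -> block.state
  )
}
```

Because the commit log stores tags as strings, numeric fields like the weights are serialized with `toString` and must be parsed back when the index is read.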
When writing data with the `qbeast` format, the user needs to specify `columnsToIndex` or `cubeSize` every time. This is fine if you want to change them, but it shouldn't be...
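One way to address this can be sketched in plain Scala, under the assumption that the options from a previous write are persisted somewhere (for example, as table properties) and can be reused as defaults. The resolution logic and error message below are hypothetical, not Qbeast's implementation; only the option names come from the issue text:

```scala
// Hypothetical sketch: reuse stored write options when the user omits them.
// The option keys match the ones mentioned in the issue; the fallback logic
// is an assumption, not Qbeast's actual behavior.
object WriteOptionsSketch {
  val ColumnsToIndex = "columnsToIndex"
  val CubeSize = "cubeSize"

  /** User-provided options win; otherwise fall back to what the table stored. */
  def resolve(userOptions: Map[String, String],
              storedOptions: Map[String, String]): Map[String, String] = {
    val merged = storedOptions ++ userOptions
    require(merged.contains(ColumnsToIndex),
      s"'$ColumnsToIndex' must be provided on the first write")
    merged
  }
}
```

With this scheme, only the very first write to a table has to specify `columnsToIndex`; later appends pick up the stored values unless the user explicitly overrides them.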