Paola Pardo issues

Results: 53 issues of Paola Pardo

Issue #294: Optimization of Unindexed Files [Staging Area]

## Description Adds #294 ## Type of change New Feature. The Unindexed Files of a Qbeast Table were only optimizable from the `StagingDataManager` component. After thinking about structure and use...

Remove Staging Data Manager if it is proven ineffective.

After deprecating the Staging Data Manager in #438, if we don't see its effectiveness in any workflow over the next one or two releases, we should delete the component.

type: enhancement

Unify Create Qbeast Table in one step

Right now, creating a Qbeast Table without data (no SAVE AS) is done in two steps: 1. Create an empty data frame and save it in the...

type: enhancement

Add QbeastTable.forTable method

We are only able to load the QbeastTable from Path now.

```scala
import io.qbeast.spark.QbeastTable

val qbeastTable = QbeastTable.forPath(spark, "/path")
```

We should be able to load it also for table...

type: enhancement

Deprecate Staging Area Manager

In the current code base, we use a `StagingDataManager` component to write new data while efficiently indexing in batch. This `StagingDataManager` was **added to avoid creating many small files when...

type: enhancement

Error when indexing a table with BIGINT

## What went wrong? When creating a table using BIGINT on a date column and inserting a set of 10 rows, a `ScalaMatch` error appears. We would need to investigate...

type: bug

Index Without Rewriting

There are two options right now to convert a Delta or Parquet table: 1. Use Convert To Qbeast Command, which does a single commit to the Delta Log with Qbeast empty...

type: enhancement

Move CDFQuantiles stats computation to the Data Analyzer

Following up on #416, we should add the CDF Quantiles computation to the Data Analyzer instead of computing it on the external API. Right now, we are using the `QbeastUtils` interface...

type: enhancement

Add Metastores Documentation

We should add documentation for getting started with different Metastore/Catalog solutions, so it's simpler to follow the guides. Let's start with Hive and Glue, as they are the most commonly used.

type: documentation

Unify Table Properties structure and storage location

When saving the data as a table (either with a CREATE TABLE SQL statement or the `saveAsTable` method), we save `columnsToIndex`, `cubeSize` and `columnStats` in the Metastore. This metadata might not...

type: enhancement