qbeast-spark

Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!

Results: 86 qbeast-spark issues

PR draft for pre-commit hooks. Issue #321

## What went wrong? Qbeast is not able to overwrite an existing delta table. ## How to reproduce? ``` // Create a delta table df.write.format("delta").save(tablePath) // Overwrite it with qbeast...

type: bug

The input `QbeastOptions.apply` gets from spark is a `CaseInsensitiveMap`. Should `QbeastOptions.toMap` return an instance of `CaseInsensitiveMap` as well?

bug
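The concern in the issue above can be illustrated without Spark: once options are held in a plain Scala `Map`, lookups become case-sensitive. The sketch below is only an illustration under stated assumptions. `OptionLookup`, `normalise`, and `lookup` are hypothetical names and not part of qbeast-spark, and the option key is just an example value.

```scala
import java.util.Locale

// Hypothetical helper, not part of qbeast-spark: emulates case-insensitive
// option lookup on top of a plain (case-sensitive) Scala Map.
object OptionLookup {

  // Lower-case every key so that lookups no longer depend on the caller's casing.
  def normalise(options: Map[String, String]): Map[String, String] =
    options.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }

  // Look a key up regardless of the casing used when it was stored.
  def lookup(options: Map[String, String], key: String): Option[String] =
    normalise(options).get(key.toLowerCase(Locale.ROOT))
}
```

With a plain `Map("columnsToIndex" -> "a,b")`, calling `get("columnstoindex")` finds nothing, while `OptionLookup.lookup` with any casing of the key finds the value. That difference is what a caller would notice if a case-insensitive map were converted to a plain one.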

Table Formats encapsulate write actions into an Optimistic Transaction. Various processes could try to commit the info to the Transaction Log, but only one would succeed, making the others retry...

enhancement
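The commit-and-retry behaviour described in the issue above can be sketched with a toy model. This is a minimal illustration under stated assumptions, not how Qbeast or any table format implements it: `ToyLog` and `OptimisticCommit` are hypothetical names, and the transaction log is reduced to a single version counter.

```scala
import java.util.concurrent.atomic.AtomicLong
import scala.annotation.tailrec

// Hypothetical stand-in for a transaction log: it only tracks the latest version.
final class ToyLog {
  private val version = new AtomicLong(0L)

  def latest: Long = version.get()

  // The commit succeeds only if no other writer has committed since
  // `readVersion` was observed (compare-and-set on the version counter).
  def tryCommit(readVersion: Long): Boolean =
    version.compareAndSet(readVersion, readVersion + 1)
}

object OptimisticCommit {

  // Read the current version, do the work against that snapshot, then try to
  // commit. On conflict, start again from the new latest version, up to
  // `maxAttempts` times. Returns the committed version, or None if all attempts lost.
  @tailrec
  def commitWithRetry(log: ToyLog, maxAttempts: Int)(prepare: Long => Unit): Option[Long] =
    if (maxAttempts <= 0) None
    else {
      val readVersion = log.latest
      prepare(readVersion)
      if (log.tryCommit(readVersion)) Some(readVersion + 1)
      else commitWithRetry(log, maxAttempts - 1)(prepare)
    }
}
```

In this model, a writer that loses the race repeats its `prepare` step against the newer version, which is the retry cost the issue refers to.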

Classes and methods, and their corresponding tests, are rendered redundant by algorithm changes such as `domain-driven double pass` and the latest changes introduced by `multi-block files`. For instance, `NormalizedWeight` should...

type: bug

Investigating in the Spark UI with simple queries, we detected that the Metadata time for the Qbeast datasource is longer than expected. Here's a comparison of a small (10-element) dataset...

bug

From v0.6.0 onwards, the Table is composed of files that contain multiple `blocks`, each of them belonging to the same or different cubes. This is part of...

type: enhancement

### WARNING: _Replication will be removed from version 0.6.0_ ## Multiblock Format The upcoming release of Qbeast Spark has [new protocol updates](https://github.com/Qbeast-io/qbeast-spark/blob/main-1.0.0/docs/QbeastFormat1.0.0.md#block-metadata-before-the-version-100). In this modification, we **change the layout of...

type: bug

## What went wrong? When enabling auto indexing, we call `SparkColumnsToIndexSelector` to choose the best columns for grouping the data. This selection is based on statistics and correlations...

type: bug

## What went wrong? If we try to save an empty DataFrame with qbeast format, we throw the following error: ```scala java.lang.RuntimeException: The DataFrame is empty, why are you trying...

type: bug