dione issues

add parquet index

## Summary Re-implement https://github.com/paypal/dione/pull/28: Adding the option to store the index table as Parquet or Avro B-tree. This is for batch-only use cases, where Avro index performance is inferior to...

shay1bz

Occasional corruption of index

5

Sometimes, when indexing an additional partition on an existing index over an Avro table, some index files become corrupt. Reading the index produces the following error: ``` org.apache.spark.SparkException: Job aborted...

eyala

No need for caching in sorted-iterator

7

## Summary In #71 we changed the block order. While checking the full iteration on Object Store (S3), we saw that it is amazingly slow. What happened is that it switched...

uzadude

Reproducing strange bug

3

## Summary While playing around with other things, I happened to reproduce #67. ## Detailed Description ## How was it tested? Added a unit test.

uzadude

Changing the implementation of the joinWithIndex to use the B-Tree

1

instead of reading the entire file ## Summary resolves #PR/#Issue don't forget: - please state if it is not backward compatible - please add relevant docs to either the function...

benraha

change to writing Avro B-tree blocks in pre-order

5

## Summary Following the discussion in #70, it looks like we are writing the blocks in the Avro B-tree in "reverse post-order". This causes us to have many hops...

uzadude

Spark Optimizer Integration

1

## Summary A very early attempt to integrate with Spark's Optimizer. ## Detailed Description Inspired by the very nice work done in Microsoft's Hyperspace project. This is just a...

uzadude

wuhaifengdhu

Fix Array type value when fetching from Parquet data

Adding support for CSV files

Support time range query
