
Dione - a Spark and HDFS indexing library

Results 24 dione issues

## Summary Re-implement https://github.com/paypal/dione/pull/28: Adding the option to store the index table as Parquet of Avro btree. This is for batch-only use cases, where Avro index performance is inferior to...

Sometimes, when indexing an additional partition on an existing index over an Avro table, some index files become corrupt. Reading the index produces the following error: ``` org.apache.spark.SparkException: Job aborted...

## Summary In #71 we changed the block order. While checking the full iteration on Object Store (S3), we saw it is amazingly slow. What happened is that it switched...

## Summary While playing around with other things, I happened to reproduce #67. ## Detailed Description ## How was it tested? Added a unit test.

instead of reading the entire file ## Summary resolves #PR/#Issue don't forget: - please state if it is not backward compatible - please add relevant docs to either the function...

## Summary Following the discussion in #70, it looks like we are writing the blocks in the Avro B-tree in "reverse post-order". This causes us to have many hops...

## Summary A very initial trial to integrate with Spark's Optimizer. ## Detailed Description Inspired by the very nice work done in Microsoft's Hyperspace project. This is just a...

## Summary We currently fail to fetch a value of type Array if the data is in Parquet format. We need to look into this. Also, `loadByIndex` fails if requested columns...

## Summary Adding support for indexing CSV files. ## How was it tested? Added unit tests.

## Summary Support time range query Links to #PR/#Issue ## Detailed Description what is the problem? how can we solve it?