delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
## Description When comparing DeltaScan filters, un-resolves and re-resolves nested field extractions from the source expression set to the target expression set. This is to get around the fact the...
## OPTIMIZE jobs aren't fully parallelized until the end of the execution ### When running the OPTIMIZE command I noticed that the execution starts with a number of jobs equal...
Signed-off-by: Denis Krivenko ## Description This PR adds variable substitution to the Delta SQL parser. Resolves https://github.com/delta-io/delta/issues/1267. ## How was this patch tested? DeltaSqlParserSuite was changed to test this PR....
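To illustrate the general idea of variable substitution in a SQL string, here is a minimal standalone sketch. It is not the Delta SQL parser's code: the `${name}` placeholder syntax, the `substitute_variables` helper, and the plain dict of variables are all assumptions made for illustration.

```python
import re

# Hypothetical helper, not part of Delta or Spark: replaces ${name}
# placeholders in a SQL string with values taken from a plain dict.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def substitute_variables(sql: str, variables: dict) -> str:
    def lookup(match):
        name = match.group(1)
        # Leave unknown placeholders untouched rather than guessing a value.
        return variables.get(name, match.group(0))

    return _PLACEHOLDER.sub(lookup, sql)


# Assumed example statement and variable names.
query = substitute_variables(
    "OPTIMIZE ${table_name} WHERE date >= '${start_date}'",
    {"table_name": "events", "start_date": "2022-01-01"},
)
print(query)
```

Substituting before parsing means the parser only ever sees the final statement text, which is why a fix of this kind can live in the parser entry point rather than in each command.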
## What changes were proposed in this pull request? - Allow multiple benchmarks to be run in sequence by giving their names as a comma-separated list - Allow the...
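The first item, running several benchmarks from one comma-separated argument, could look roughly like the sketch below. The `run_benchmarks` function, the registry dict, and the benchmark names are hypothetical and are not taken from the Delta benchmarks code.

```python
from typing import Callable, Dict, List


def run_benchmarks(names_arg: str, registry: Dict[str, Callable[[], None]]) -> List[str]:
    """Run the benchmarks named in a comma-separated string, in order."""
    # Split on commas and drop empty entries such as a trailing comma.
    names = [n.strip() for n in names_arg.split(",") if n.strip()]

    # Validate all names up front so nothing runs if one name is wrong.
    unknown = [n for n in names if n not in registry]
    if unknown:
        raise ValueError(f"Unknown benchmarks: {', '.join(unknown)}")

    for name in names:
        registry[name]()
    return names


# Hypothetical registry: each benchmark just records that it ran.
executed: List[str] = []
registry = {
    "load": lambda: executed.append("load"),
    "query": lambda: executed.append("query"),
}
ran = run_benchmarks("load, query", registry)
```

Validating every name before running anything avoids a long benchmark finishing only for the sequence to fail on a misspelled later name.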
The checkpoint documentation in the protocol specification is ambiguous and could be more concrete. [Requirements for writers](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#checkpoint-format) says that each row in a checkpoint is an action. Later, [checkpoint schema](https://github.com/delta-io/delta/blob/master/PROTOCOL.md#checkpoint-schema) indicates that every...
## Feature request ### Overview The current VACUUM implementation is sometimes very inefficient and slow for a few reasons: * The first phase of vacuum lists all files. It is done in...
- Python v3.7.5 - PySpark v3.1.2 - delta-spark v1.0.0 I get an error when using subqueries in the WHERE predicate of a delete. This code works fine on Databricks but when running it...
## Feature request ### Overview The Delta codebase shows two isolation levels: Serializable and WriteSerializable. However, in OSS Delta the default isolation level is Serializable, and this cannot be...
## Bug ### Describe the problem I am trying to get lastCommitVersionInSession in a thread-safe way, so I cloned a new session in a thread. But it only works in Databricks, not with...
## Bug ### Describe the problem If you have a generated column that uses a nested field, the entire top level struct can no longer evolve the schema, even though...