dask-sql
Distributed SQL Engine in Python using Dask
GPU backend dependencies aren't included in the development environment yml files, causing pytests to fail with `--rungpu` out of the box. Datafusion branch: https://github.com/dask-contrib/dask-sql/blob/datafusion-sql-planner/continuous_integration/environment-3.10-dev.yaml It would be useful for there...
**Is your feature request related to a problem? Please describe.** PR https://github.com/apache/arrow-datafusion/pull/2885 adds three new optimizer rules for decorrelating subqueries and translating them into joins. This may result in more...
```
import pandas as pd
from dask_sql import Context

c = Context()
df = pd.DataFrame({"id": [0, 1, 2]})
c.create_table("df", df)

# returns a DataFrame
c.sql("select * from df")

# returns...
```
**What happened**: Joining tables backed by dask_cudf dataframes with multiple partitions raises the error `AttributeError: 'Int64Index' object has no attribute '_get_attributes_dict'`. **Minimal Complete Verifiable Example**: ```python import...
I often inherit existing SQL files which contain a series of queries/statements that should be executed one after the other. It's fairly easy to do something like: ``` with open("my_sql.txt")...
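As a rough illustration of the pattern this request describes, the sketch below splits a script into individual statements and passes each one to an executor in order. The helper names `split_sql_statements` and `run_sql_script` are hypothetical and not part of dask-sql. The splitting is deliberately naive: it only protects semicolons inside single-quoted strings and does not understand SQL comments.

```python
from typing import Any, Callable, List


def split_sql_statements(script: str) -> List[str]:
    """Split a SQL script on semicolons that are outside single-quoted strings.

    Hypothetical helper: a naive splitter that ignores SQL comments and
    dialect-specific quoting rules.
    """
    statements: List[str] = []
    current: List[str] = []
    in_string = False
    for ch in script:
        if ch == "'":
            # Toggle string state; a doubled quote ('') toggles twice.
            in_string = not in_string
        if ch == ";" and not in_string:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


def run_sql_script(script: str, execute: Callable[[str], Any]) -> List[Any]:
    """Run each statement in order and collect whatever `execute` returns."""
    return [execute(stmt) for stmt in split_sql_statements(script)]
```

With a dask-sql `Context` named `c`, one could presumably pass `c.sql` as the `execute` callable, for example `run_sql_script(open("my_sql.txt").read(), c.sql)`.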
**What happened**: - I'm running a query in dask-sql but I'm encountering a KeyError. **What you expected to happen**: - I expected it to return the rank of each row...
**Is your feature request related to a problem? Please describe.** With DataFusion we now map operations like `column in (val1, val2, val3)` to `series.isin`. This operator does not support predicate pushdown. **Describe...
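To make the mapping concrete, here is a small pandas-only sketch of what `id IN (1, 3)` looks like as `Series.isin`, alongside an equivalent chain of equality comparisons joined with OR. The equality form is shown only as an assumption about what a pushdown-friendly rewrite could look like; it is not taken from the issue, and the column name and values are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"id": [0, 1, 2, 3]})

# `id IN (1, 3)` expressed as Series.isin, the mapping described above.
mask_isin = df["id"].isin([1, 3])

# Assumed equivalent form: a disjunction of simple equality predicates.
mask_or = (df["id"] == 1) | (df["id"] == 3)

# Both masks select the same rows.
filtered = df[mask_isin]
same_selection = mask_isin.equals(mask_or)
```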
After packaging updates that changed the minimum versions in dask-sql, the conda solver fails to resolve the environment in our Dockerfile setup. Switching to mamba seems to resolve this.
The purpose of this issue is to track the work that we need to do in the [DataFusion](https://github.com/apache/arrow-datafusion) project to support moving the dask-sql planner to DataFusion. ## High Priority...
**Is your feature request related to a problem? Please describe.** Currently, the `IN` clause has no supporting logic for `getOperands` in `expression.rs`; the Rust pattern matching arm for `Expr::InList`...