dask-sql
Distributed SQL Engine in Python using Dask
If the first table a user creates is created with `CREATE OR REPLACE` syntax, they get a confusing INFO: ``` CREATE OR REPLACE TABLE stations WITH ( location = '/raid/weather_meta/stations.parquet',...
Chatting with @DaceT exposed that `python_to_sql_type` only has handling for pandas / numpy types, which can cause issues when attempting to read in a table with a cuDF dtype, like...
**Is your feature request related to a problem? Please describe.** It would be nice if Dask-SQL supported tables with categorical columns; currently, when attempting to query a table with...
Often, we will specify a `LIMIT` on the result of a complex query, which can potentially leave us with a Dask dataframe with several empty partitions (in my case, I...
The following example has a dataframe with an index named "timestamp". If you want to use the column in a WHERE clause in SQL there is an error. ```python import...
## Report incorrect documentation **Location of incorrect documentation** There is an error loading [binder](https://mybinder.org/v2/gh/dask-contrib/dask-sql-binder/main?urlpath=lab) in the link provided in the [Quickstart section](https://github.com/dask-contrib/dask-sql#quickstart) of the readme. **Describe the problems or issues...
**Is your feature request related to a problem? Please describe.** I am interested in building my own custom SqlParser and Plugins, similar to the way the custom [machine learning](https://dask-sql.readthedocs.io/en/latest/pages/machine_learning.html) features...
**Is your feature request related to a problem? Please describe.** We should warn users when training non-Dask-friendly ML models with `wrap_predict=False`. This is because when `wrap_predict=False` we assume...
Following the Calcite docs, I was confused about how to compute the difference between two timestamps. I found `TIMESTAMPDIFF(timeUnit, datetime, datetime2)`, but that doesn't seem to work: ``` import pandas...
Related to #412: I'm trying to do math on timestamps, but ran into #411 and #412. The Calcite docs suggest ``` UNIX_SECONDS(timestamp) | Returns the number of seconds since 1970-01-01...
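For the two timestamp reports above, the arithmetic being asked for can be sketched in plain Python. This is only an illustration of the semantics quoted from the Calcite docs (seconds since 1970-01-01, and a difference between two timestamps); it is not dask-sql or Calcite code, and the helper names `unix_seconds` and `timestamp_diff_seconds` are hypothetical. It assumes naive datetimes are meant as UTC.

```python
from datetime import datetime, timedelta, timezone

# Reference point for Unix time.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_seconds(ts: datetime) -> int:
    """Whole seconds since 1970-01-01 00:00:00 UTC (hypothetical helper).

    Assumption: a naive datetime is interpreted as UTC.
    """
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    # Floor division of two timedeltas yields an int.
    return (ts - EPOCH) // timedelta(seconds=1)


def timestamp_diff_seconds(start: datetime, end: datetime) -> int:
    """Seconds from `start` to `end`; negative if `end` is earlier (hypothetical helper)."""
    return unix_seconds(end) - unix_seconds(start)


start = datetime(2022, 1, 1, 0, 0, 0)
end = datetime(2022, 1, 1, 0, 1, 30)
print(unix_seconds(start))
print(timestamp_diff_seconds(start, end))
```

Converting both timestamps to epoch seconds first and then subtracting is one way to work around a missing difference function, which appears to be what the last report is attempting in SQL.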