dask-sql
Generalize dtype check in `python_to_sql_type`
Chatting with @DaceT exposed that `python_to_sql_type` only handles pandas / numpy types, which can cause issues when attempting to read in a table with a cuDF dtype, like so:
```python
import cudf
from dask_sql import Context

c = Context()
cdf = cudf.datasets.timeseries()
c.create_table("tble", cdf)

result = c.sql("""SELECT timestamp FROM tble""")
result.compute()
```

```
TypeError: unhashable type: 'CategoricalDtype'
```
A pretty simple workaround for this is generalizing the column dtype check so that instead of checking specifically for a `np.dtype`, we check whether the Python type is "dtype-like". I'm interested in whether there's a trivial util function like `is_dataframe_like`, but for dtypes?
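One possible shape for such a check (a sketch only, not necessarily what dask-sql should land on; the `is_dtype_like` name is hypothetical) is to lean on pandas' public dtype coercion, which accepts numpy dtypes as well as pandas/cuDF extension dtypes like `CategoricalDtype`:

```python
import numpy as np
import pandas as pd

def is_dtype_like(t) -> bool:
    """Return True if ``t`` can be interpreted as a column dtype.

    Hypothetical helper: ``pd.api.types.pandas_dtype`` accepts numpy
    dtypes and pandas-style extension dtypes (including cuDF's, which
    subclass pandas' ExtensionDtype), and raises TypeError otherwise.
    """
    try:
        pd.api.types.pandas_dtype(t)
        return True
    except TypeError:
        return False
```

With a helper like this, the mapping code could branch on "dtype-like" rather than `isinstance(t, np.dtype)`, so cuDF categorical columns would take the same path as their pandas equivalents.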
Codecov Report
Merging #402 (33a1801) into main (00df420) will increase coverage by 0.14%. The diff coverage is 100.00%.
```diff
@@            Coverage Diff             @@
##             main     #402      +/-   ##
==========================================
+ Coverage   88.94%   89.09%   +0.14%
==========================================
  Files          68       68
  Lines        3337     3337
  Branches      657      657
==========================================
+ Hits         2968     2973       +5
+ Misses        297      288       -9
- Partials       72       76       +4
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| dask_sql/mappings.py | 100.00% <100.00%> (ø) | |
| dask_sql/_version.py | 34.00% <0.00%> (+1.44%) | :arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00df420...33a1801. Read the comment docs.
Things are a little more complicated than I initially assumed here; opened #423 to track this issue.