dask-sql
[BUG] [GPU Error Bug] "SELECT (CASE false WHEN (true) THEN (true) WHEN false THEN false END ), ('<string>'), false FROM <table>" raises an error
What happened:
The query "SELECT (CASE false WHEN (true) THEN (true) WHEN false THEN false END ), ('<string>'), false FROM <table>" raises an error when the table is created with gpu=True.
The same query returns a result when the table is created with gpu=False.
What you expected to happen:
The query should return the same result on GPU as on CPU, without raising an error.
Minimal Complete Verifiable Example:
import pandas as pd
import dask.dataframe as dd
from dask_sql import Context
c = Context()
df0 = pd.DataFrame({
'c0': [835.0000],
})
t0 = dd.from_pandas(df0, npartitions=1)
c.create_table('t0', t0, gpu=False)
c.create_table('t0_gpu', t0, gpu=True)
print('CPU Result:')
result1= c.sql("SELECT (CASE false WHEN (true) THEN (true) WHEN false THEN false END ), ('A'), false FROM t0").compute()
print(result1)
print('GPU Result:')
result2= c.sql("SELECT (CASE false WHEN (true) THEN (true) WHEN false THEN false END ), ('A'), false FROM t0_gpu").compute()
print(result2)
Result:
INFO:numba.cuda.cudadrv.driver:init
CPU Result:
CASE Boolean(false) WHEN Boolean(true) THEN Boolean(true) WHEN Boolean(false) THEN Boolean(false) END Utf8("A") Boolean(false)
0 False A False
GPU Result:
Traceback (most recent call last):
File "/tmp/bug.py", line 20, in <module>
result2= c.sql("SELECT (CASE false WHEN (true) THEN (true) WHEN false THEN false END ), ('A'), false FROM t0_gpu").compute()
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/context.py", line 513, in sql
return self._compute_table_from_rel(rel, return_futures)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/context.py", line 869, in _compute_table_from_rel
df = dc.assign()
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/datacontainer.py", line 229, in assign
df.columns = self.column_container.columns
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/dataframe/core.py", line 4887, in __setattr__
object.__setattr__(self, key, value)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/dataframe/core.py", line 4746, in columns
renamed = _rename_dask(self, columns)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/dataframe/core.py", line 7084, in _rename_dask
metadata = _rename(names, df._meta)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/dataframe/core.py", line 7055, in _rename
df.columns = columns
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/cudf/core/dataframe.py", line 1094, in __setattr__
super().__setattr__(key, col)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/nvtx/nvtx.py", line 101, in inner
result = func(*args, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/cudf/core/dataframe.py", line 2473, in columns
raise ValueError(
ValueError: Length mismatch: expected 2 elements, got 3 elements
Anything else we need to know?:
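The last frame of the traceback is a column rename: three names (one per SELECT expression) are assigned to a frame that, according to the error message, holds only two columns. Why the GPU frame ends up with two columns is not established here. The sketch below only reproduces the same class of error on CPU with plain pandas; the two-column frame and the three names are hypothetical stand-ins, not the actual dask-sql intermediates.

```python
import pandas as pd

# Hypothetical stand-in for the intermediate frame: two columns only.
df = pd.DataFrame({"a": [False], "b": ["A"]})

# Hypothetical names, one per expression in the SELECT list (three in total).
new_names = ["case_expr", "literal_a", "literal_false"]

try:
    # Assigning three names to a two-column frame fails, as in the last traceback frame.
    df.columns = new_names
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

If this matches what happens internally, comparing the number of columns of the frame in datacontainer.py's assign() with the length of column_container.columns on the GPU path should show where the third column is lost.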
Environment:
- dask-sql version: 2023.6.0
- Python version: Python 3.10.11
- Operating System: Ubuntu 22.04
- Install method (conda, pip, source): Docker, using the image at https://hub.docker.com/layers/rapidsai/rapidsai-dev/23.06-cuda11.8-devel-ubuntu22.04-py3.10/images/sha256-cfbb61fdf7227b090a435a2e758114f3f1c31872ed8dbd96e5e564bb5fd184a7?context=explore