dask-sql
[BUG] [Crash Bug] "SELECT (((((NOT t1.c0))AND(('A' LIKE 'B' ESCAPE '/'))))=(t2.c0)) FROM t1, t2" crashes on CPU
What happened:
The query "SELECT (((((NOT t1.c0))AND(('A' LIKE 'B' ESCAPE '/'))))=(t2.c0)) FROM t1, t2" crashes with a TypeError when run against CPU tables.
The same query returns a result when run against GPU tables.
What you expected to happen:
The query should run against CPU tables without crashing.
Minimal Complete Verifiable Example:
import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

c = Context()

# t1: c0 is a float column, which the query negates with NOT
df1 = pd.DataFrame({
    'c0': [5055.0],
    'c1': [False],
})
t1 = dd.from_pandas(df1, npartitions=1)
# Register the same data once as a CPU table and once as a GPU table
c.create_table('t1', t1, gpu=False)
c.create_table('t1_gpu', t1, gpu=True)

# t2: c0 is a boolean column, compared against the NOT ... AND ... expression
df2 = pd.DataFrame({
    'c0': [True],
    'c1': ["'T'"],
})
t2 = dd.from_pandas(df2, npartitions=1)
c.create_table('t2', t2, gpu=False)
c.create_table('t2_gpu', t2, gpu=True)

# GPU tables: the query returns a result
print('GPU Result:')
result2 = c.sql("SELECT (((((NOT t1_gpu.c0))AND(('A' LIKE 'B' ESCAPE '/'))))=(t2_gpu.c0)) FROM t1_gpu, t2_gpu").compute()
print(result2)

# CPU tables: the same query raises TypeError
print('CPU Result:')
result1 = c.sql("SELECT (((((NOT t1.c0))AND(('A' LIKE 'B' ESCAPE '/'))))=(t2.c0)) FROM t1, t2").compute()
print(result1)
Result:
INFO:numba.cuda.cudadrv.driver:init
GPU Result:
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
NOT t1_gpu.c0 AND Utf8("A") LIKE Utf8("B") CHAR '/' = t2_gpu.c0
0 False
CPU Result:
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
WARNING:datafusion_optimizer.optimizer:Skipping optimizer rule 'simplify_expressions' due to unexpected error: Execution error: LIKE does not support escape_char
Traceback (most recent call last):
File "/tmp/bug.py", line 28, in <module>
result1= c.sql("SELECT (((((NOT t1.c0))AND(('A' LIKE 'B' ESCAPE '/'))))=(t2.c0)) FROM t1, t2").compute()
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/base.py", line 314, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/base.py", line 599, in compute
results = schedule(dsk, keys, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/threaded.py", line 89, in get
results = get_async(
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/local.py", line 511, in get_async
raise_exception(exc, tb)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/local.py", line 319, in reraise
raise exc
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/local.py", line 224, in execute_task
result = _execute_task(task, data)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
return func(*(_execute_task(a, cache) for a in args))
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/optimization.py", line 990, in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 149, in get
result = _execute_task(task, cache)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
return func(*(_execute_task(a, cache) for a in args))
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in <genexpr>
return func(*(_execute_task(a, cache) for a in args))
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
return func(*(_execute_task(a, cache) for a in args))
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in <genexpr>
return func(*(_execute_task(a, cache) for a in args))
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
return func(*(_execute_task(a, cache) for a in args))
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/utils.py", line 73, in apply
return func(*args, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/utils.py", line 1105, in __call__
return getattr(__obj, self.method)(*args, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/generic.py", line 6240, in astype
new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 448, in astype
return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 352, in apply
applied = getattr(b, f)(**kwargs)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/internals/blocks.py", line 526, in astype
new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/dtypes/astype.py", line 299, in astype_array_safe
new_values = astype_array(values, dtype, copy=copy)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/dtypes/astype.py", line 230, in astype_array
values = astype_nansafe(values, dtype, copy=copy)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/dtypes/astype.py", line 95, in astype_nansafe
return dtype.construct_array_type()._from_sequence(arr, dtype=dtype, copy=copy)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/arrays/masked.py", line 132, in _from_sequence
values, mask = cls._coerce_to_array(scalars, dtype=dtype, copy=copy)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/arrays/boolean.py", line 344, in _coerce_to_array
return coerce_to_array(value, copy=copy)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/arrays/boolean.py", line 185, in coerce_to_array
raise TypeError("Need to pass bool-like values")
TypeError: Need to pass bool-like values
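The last frames of the traceback are inside pandas (masked.py and boolean.py), which suggests that the CPU path casts the float column t1.c0 to pandas' nullable boolean dtype before applying NOT. The sketch below is not taken from dask-sql. It assumes the cast is equivalent to astype("boolean") and checks whether that cast alone fails for the value 5055.0, compared with a plain NumPy bool cast. The helper name try_cast is made up for illustration.

```python
import pandas as pd

# Same value as t1.c0 in the example above.
s = pd.Series([5055.0])


def try_cast(series, dtype):
    """Hypothetical helper: return (result, None) on success or (None, message) on TypeError."""
    try:
        return series.astype(dtype), None
    except TypeError as exc:
        return None, str(exc)


# Assumption: the CPU path casts to pandas' nullable "boolean" extension dtype.
nullable_result, nullable_error = try_cast(s, "boolean")

# For comparison: a plain NumPy bool cast treats any non-zero float as True.
numpy_result, numpy_error = try_cast(s, bool)

print("nullable boolean cast error:", nullable_error)
print("NumPy bool cast result:", None if numpy_result is None else numpy_result.tolist())
```

If the nullable cast fails here with the same message as in the traceback, the crash is likely caused by the float-to-boolean cast of t1.c0 rather than by the LIKE ... ESCAPE part of the query.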
Anything else we need to know?:
Environment:
- dask-sql version: 2023.6.0
- Python version: Python 3.10.11
- Operating System: Ubuntu 22.04
- Install method (conda, pip, source): Docker, using the image https://hub.docker.com/layers/rapidsai/rapidsai-dev/23.06-cuda11.8-devel-ubuntu22.04-py3.10/images/sha256-cfbb61fdf7227b090a435a2e758114f3f1c31872ed8dbd96e5e564bb5fd184a7?context=explore