dask-sql
[BUG] [Logic Bug] "SELECT <column> FROM <table>" via JDBC returns different results on CPU and GPU
What happened:
Running "SELECT <column> FROM <table>" through JDBC returns different results for the CPU table and the GPU table.
However, the same query run through the Python API returns the same result for both.
What you expected to happen:
The query should return the same result on CPU and GPU.
Minimal Complete Verifiable Example:
Query by JDBC:
DROP SCHEMA IF EXISTS database0;
CREATE SCHEMA IF NOT EXISTS database0;
USE SCHEMA database0;
CREATE TABLE t0 WITH ( location = '/tmp/t0.csv', format = 'csv', gpu = FALSE );
CREATE TABLE t0_gpu WITH ( location = '/tmp/t0.csv', format = 'csv', gpu = TRUE );
t0.csv:
c0,c1,c2
',?', 'E', True
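To make the file easier to recreate and inspect, here is a minimal sketch that holds the exact contents of t0.csv and tokenizes them with Python's standard csv module. It only shows how the sample row splits under two quoting conventions. Whether the CPU and GPU CSV readers used by dask-sql behave like either one is an assumption, not something confirmed here. The helper names `write_t0_csv` and `tokenize` are hypothetical.

```python
import csv
import io

# Contents of t0.csv exactly as given in this report.
T0_CSV = "c0,c1,c2\n',?', 'E', True\n"


def write_t0_csv(path="/tmp/t0.csv"):
    """Write the sample file to the location used by CREATE TABLE."""
    with open(path, "w", newline="") as f:
        f.write(T0_CSV)


def tokenize(text, **dialect):
    """Split CSV text into rows of fields with the stdlib csv reader."""
    return list(csv.reader(io.StringIO(text), **dialect))


# Default dialect: only the double quote is treated as a quote character,
# so the comma inside ',?' acts as a field separator.
header, row = tokenize(T0_CSV)
print(len(header), len(row), row)

# Assumption: the single quotes were meant as quoting. This dialect treats
# them that way and skips the space after each delimiter.
_, quoted_row = tokenize(T0_CSV, quotechar="'", skipinitialspace=True)
print(len(quoted_row), quoted_row)
```

With the default dialect, the data row splits into four fields against three header names. With the single-quote dialect, it splits into three. Calling `write_t0_csv()` recreates the file at the path used in the CREATE TABLE statements above.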
SQL:
SELECT t0.c2 FROM t0;
Result:
c2
-------
True
(1 row)
SQL:
SELECT t0_gpu.c2 FROM t0_gpu;
Result:
c2
------
'E'
(1 row)
Query by python:
import pandas as pd
import dask.dataframe as dd
from dask_sql import Context
c = Context()
df = pd.DataFrame({
    'c0': [',?'],
    'c1': ['E'],
    'c2': [True],
})
t0 = dd.from_pandas(df, npartitions=1)
c.create_table('t0', t0, gpu=False)
c.create_table('t0_gpu', t0, gpu=True)
print('CPU Result:')
result1 = c.sql("SELECT t0.c2 FROM t0").compute()
print(result1)
print('GPU Result:')
result2 = c.sql("SELECT t0_gpu.c2 FROM t0_gpu").compute()
print(result2)
Result:
INFO:numba.cuda.cudadrv.driver:init
CPU Result:
c2
0 True
GPU Result:
c2
0 True
Anything else we need to know?:
Environment:
- dask-sql version: 2023.6.0
- Python version: Python 3.10.11
- Operating System: Ubuntu 22.04
- Install method (conda, pip, source): Docker, deployed from https://hub.docker.com/layers/rapidsai/rapidsai-dev/23.06-cuda11.8-devel-ubuntu22.04-py3.10/images/sha256-cfbb61fdf7227b090a435a2e758114f3f1c31872ed8dbd96e5e564bb5fd184a7?context=explore
- JDBC Version: presto-jdbc 0.283