dask-sql

[BUG][GPU Logic Bug] "SELECT <column> FROM <table>" brings Error

Open qwebug opened this issue 2 years ago • 0 comments

What happened:

Running "SELECT <column> FROM <table>" through JDBC and through Python returns 4 different results across the CPU and GPU tables.

What you expected to happen:

The query should return the same result on CPU and GPU.

Minimal Complete Verifiable Example:

Query by JDBC:

DROP SCHEMA IF EXISTS database0;
CREATE SCHEMA IF NOT EXISTS database0;
USE SCHEMA database0;
CREATE TABLE t1 WITH ( location = '/tmp/t1.csv', format = 'csv', gpu = FALSE );
CREATE TABLE t1_gpu WITH ( location = '/tmp/t1.csv', format = 'csv', gpu = TRUE );

t1.csv:

c0,c1,c2,c3
'', True, CAST((-127) AS TINYINT), 'Q,,p 4 v'
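A note on the input file, offered as a possible factor rather than a confirmed cause: the data row uses single quotes and has a comma inside `'Q,,p 4 v'`, so a reader that only treats `"` as a quote character sees more fields than the header declares. The sketch below uses Python's standard `csv` module (not the pandas or cuDF readers that dask-sql actually relies on) just to count the fields; `CSV_TEXT` and `split_fields` are illustrative names, not part of the report.

```python
import csv
import io

# Contents of /tmp/t1.csv as given in this report.
CSV_TEXT = "c0,c1,c2,c3\n'', True, CAST((-127) AS TINYINT), 'Q,,p 4 v'\n"


def split_fields(text, **reader_options):
    """Parse CSV text with the stdlib reader and return a list of rows."""
    return list(csv.reader(io.StringIO(text), **reader_options))


# Default dialect: '"' is the quote character, so single quotes are plain
# characters and the commas inside 'Q,,p 4 v' split the value.
default_rows = split_fields(CSV_TEXT)

# Treat "'" as the quote character and skip the space after each delimiter.
quoted_rows = split_fields(CSV_TEXT, quotechar="'", skipinitialspace=True)

print(len(default_rows[0]), "header fields")
print(len(default_rows[1]), "data fields with default quoting")
print(len(quoted_rows[1]), "data fields with single-quote quoting")
print(quoted_rows[1])
```

With default quoting the header has 4 fields and the data row has 6, so each backend has to decide how to reconcile the mismatch. Whether that decision explains the differing outputs in this issue is an assumption that would need to be checked against the pandas and cuDF readers directly.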

SQL:

SELECT t1.c2 FROM t1;

Result:

  c2  
------
 NULL 
(1 row)

SQL:

SELECT t1_gpu.c2 FROM t1_gpu;

Result:

            c2            
--------------------------
  CAST((-127) AS TINYINT) 
(1 row)

Query by python:

import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

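c = Context()

# Register the same Dask DataFrame twice: once as a CPU table, once as a GPU table.
t1 = dd.read_csv('/tmp/t1.csv')
c.create_table('t1', t1, gpu=False)
c.create_table('t1_gpu', t1, gpu=True)

# Run the same projection against both tables and print each result.
print('CPU Result:')
result1 = c.sql("SELECT t1.c2 FROM t1").compute()
print(result1)

print('GPU Result:')
result2 = c.sql("SELECT t1_gpu.c2 FROM t1_gpu").compute()
print(result2)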

Result:

INFO:numba.cuda.cudadrv.driver:init
CPU Result:
          c2
''  True NaN
GPU Result:
            c2
''  True  <NA>

Anything else we need to know?:

Environment:

qwebug, Sep 20 '23 16:09