dask-sql

[BUG][GPU Logic Bug] "SELECT (<string>)||(<column(decimal)>) FROM <table>" brings Error

Open qwebug opened this issue 2 years ago • 4 comments

What happened:

"SELECT (<string>)||(<column(decimal)>) FROM <table>" returns different results on CPU and GPU.

What you expected to happen:

The query should return the same result on CPU and GPU.

Minimal Complete Verifiable Example:

import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

c = Context()

df = pd.DataFrame({
    'c0': [0.5113391810437729]
})
t1 = dd.from_pandas(df, npartitions=1)

c.create_table('t1', t1, gpu=False)
c.create_table('t1_gpu', t1, gpu=True)

print('CPU Result:')
result1 = c.sql("SELECT ('A')||(t1.c0) FROM t1").compute()
print(result1)

print('GPU Result:')
result2 = c.sql("SELECT ('A')||(t1_gpu.c0) FROM t1_gpu").compute()
print(result2)

Result:

CPU Result:
    Utf8("A") || t1.c0
0  A0.5113391810437729
GPU Result:
  Utf8("A") || t1_gpu.c0
0           A0.511339181
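
The two outputs above differ only in how many digits of the float are printed. The following is a minimal pure-Python sketch of that difference, not a reproduction of the GPU code path. It assumes that formatting the value to 9 significant digits mimics the GPU string; that choice is inferred from the printed result and is not a statement about how cuDF casts floats internally.

```python
# Value taken from the reproducer above.
value = 0.5113391810437729

# repr() emits the shortest decimal string that round-trips the float64,
# which matches the CPU output.
cpu_like = "A" + repr(value)

# Assumption: 9 significant digits reproduces the GPU output string.
# This only illustrates the truncation; it does not call cuDF.
gpu_like = "A" + format(value, ".9g")

print(cpu_like)
print(gpu_like)

# Only the full-precision string parses back to the original float.
print(float(cpu_like[1:]) == value)
print(float(gpu_like[1:]) == value)
```

The round-trip check at the end shows why the shorter string is lossy: parsing it back yields a different float64 than the one stored in the column.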

Anything else we need to know?:

Environment:

qwebug avatar Sep 19 '23 14:09 qwebug

Trying out your reproducer with latest main gives me an error 😕 It looks like at some point between 2023.6.0 and now, our logical plan changed so that we skip casting the non-string column:

# 2023.6.0
Projection: Utf8("A") || CAST(t1.c0 AS Utf8)
  TableScan: t1 projection=[c0]

# main
Projection: Utf8("A") || t1.c0
  TableScan: t1 projection=[c0]

This leads to errors in the binary operation; cc @jdye64 if you have any capacity to look into this. As for the original issue, it seems to come down to a difference in the behavior of cast operations on CPU and GPU, since the following shows the same issue:

print('CPU Result:')
result1 = c.sql("SELECT CAST(c0 AS STRING) FROM t1").compute()
print(result1)

print('GPU Result:')
result2 = c.sql("SELECT CAST(c0 AS STRING) FROM t1_gpu").compute()
print(result2)

I can look into that. Would you mind modifying your issue description / title to reflect this?

charlesbluca avatar Oct 25 '23 17:10 charlesbluca

Thanks for confirming. We look forward to updates on the fix.

qwebug avatar Nov 29 '23 15:11 qwebug

This problem appeared in dask-sql 2023.6.0. I have verified that it is fixed in dask-sql 2024.3.0. Thanks to the developers for their contributions.

qwebug avatar Jun 05 '24 20:06 qwebug