An inner join in SQL returns wrong results

Open l1t1 opened this issue 5 months ago • 3 comments

Checks

  • [X] I have checked that this issue has not already been reported.
  • [X] I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

df = pl.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "fruits": ["banana", "banana", "apple", "apple", "banana"],
        "B": [5, 4, 3, 2, 1],
        "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
    }
)
context = pl.SQLContext()
context.register("t", df)
context.register("t1", df)
>>> lf = context.execute("select t.A,t.fruits,t1.B,t1.cars from t,t1 where t.A=t1.B")
>>> lf.collect()
shape: (1, 4)
┌─────┬────────┬─────┬────────┐
│ A   ┆ fruits ┆ B   ┆ cars   │
│ --- ┆ ---    ┆ --- ┆ ---    │
│ i64 ┆ str    ┆ i64 ┆ str    │
╞═════╪════════╪═════╪════════╡
│ 3   ┆ apple  ┆ 3   ┆ beetle │
└─────┴────────┴─────┴────────┘
>>> lf = context.execute("select t.A,t.fruits,t1.B,t1.cars from t join t1 on t.A=t1.B")
>>> lf.collect()
shape: (5, 4)
┌─────┬────────┬─────┬────────┐
│ A   ┆ fruits ┆ B   ┆ cars   │
│ --- ┆ ---    ┆ --- ┆ ---    │
│ i64 ┆ str    ┆ i64 ┆ str    │
╞═════╪════════╪═════╪════════╡
│ 5   ┆ banana ┆ 1   ┆ beetle │
│ 4   ┆ apple  ┆ 2   ┆ beetle │
│ 3   ┆ apple  ┆ 3   ┆ beetle │
│ 2   ┆ banana ┆ 4   ┆ audi   │
│ 1   ┆ banana ┆ 5   ┆ beetle │
└─────┴────────┴─────┴────────┘

Log output

No response

Issue description

The 1st SQL query should return the 5 rows shown under "Expected behavior", but it returns only 1 row. The 2nd SQL query should return only rows where t.A=t1.B, but of the 5 rows it returns, only 1 satisfies that condition (t.A=t1.B=3).
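
The claim about the 2nd query can be checked by testing the rows it printed above against the join condition in plain Python. This is only a sketch: the rows are copied by hand from the output, and the names `second_query_rows` and `valid_rows` are illustrative, not part of Polars.

```python
# Rows printed by the 2nd query (t join t1 on t.A=t1.B), copied by hand
# from the output above as (A, fruits, B, cars) tuples.
second_query_rows = [
    (5, "banana", 1, "beetle"),
    (4, "apple", 2, "beetle"),
    (3, "apple", 3, "beetle"),
    (2, "banana", 4, "audi"),
    (1, "banana", 5, "beetle"),
]

# Keep only the rows that actually satisfy the join condition t.A = t1.B.
valid_rows = [row for row in second_query_rows if row[0] == row[2]]
print(valid_rows)  # [(3, 'apple', 3, 'beetle')]
```

Four of the five returned rows have A different from B, so they cannot be the result of a join on t.A=t1.B.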

Expected behavior

Both queries should return these 5 rows:

+-----+--------+-----+--------+
| A   | fruits | B   | cars   |
| --- | ---    | --- | ---    |
| i64 | str    | i64 | str    |
+-----+--------+-----+--------+
| 1   | banana | 1   | beetle |
| 2   | banana | 2   | beetle |
| 3   | apple  | 3   | beetle |
| 4   | apple  | 4   | audi   |
| 5   | banana | 5   | beetle |
+-----+--------+-----+--------+
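
For reference, the expected rows can be derived without Polars by applying standard inner-join semantics to the same data in plain Python. This is a minimal sketch, not how Polars evaluates the query. The names `rows` and `expected` are illustrative, and the sort is only there to make the row order deterministic (SQL guarantees no order without ORDER BY).

```python
# Same data as the DataFrame in the reproducible example,
# stored as (A, fruits, B, cars) tuples.
A = [1, 2, 3, 4, 5]
fruits = ["banana", "banana", "apple", "apple", "banana"]
B = [5, 4, 3, 2, 1]
cars = ["beetle", "audi", "beetle", "beetle", "beetle"]
rows = list(zip(A, fruits, B, cars))

# Inner join of the table with itself on t.A = t1.B:
# A and fruits come from the left row (t), B and cars from the right row (t1).
expected = sorted(
    (a, fruit, b, car)
    for (a, fruit, _, _) in rows
    for (_, _, b, car) in rows
    if a == b
)

for row in expected:
    print(row)
```

The nested loop is a cross join followed by a filter, which is what the comma-style query (`from t,t1 where ...`) expresses; the explicit `join ... on` form should produce the same set of rows.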

Installed versions

--------Version info---------
Polars:               0.20.3
Index type:           UInt32
Platform:             Windows-7-6.1.7601-SP1
Python:               3.8.8 (tags/v3.8.8:024d805, Feb 19 2021, 13:18:16) [MSC v.1928 64 bit (AMD64)]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          2.0.0
connectorx:           <not installed>
deltalake:            <not installed>
fsspec:               2021.11.1
gevent:               22.10.2
hvplot:               <not installed>
matplotlib:           3.3.4
numpy:                1.23.4
openpyxl:             3.1.1
pandas:               1.3.2
pyarrow:              6.0.1
pydantic:             1.8.2
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>

l1t1 · Jan 11 '24 01:01