clickhouse-sqlalchemy
clickhouse-sqlalchemy copied to clipboard
select() clause with join creates a cross join
Describe the bug Hi, I'm in the process of upgrading our codebase to use SqlAlchemy 1.4. I have a problem with select clause(that SqlAlchemy would like us to move to instead of query) A simple .join statement produces a cross join as it adds the table twice Once for the table in the select and once for the join. I feared that I'm not familiar enough with the select clause but I tested the same code with Postgres and it works fine so I'm a bit puzzled.
To Reproduce The tables:
CREATE TABLE First
(
`first_column` String
)
ENGINE = Memory;`
`
CREATE TABLE Second
(
`second_column` String
)
ENGINE = Memory;`
The select:
stmt = select(
first.c.first_column,
second.c.second_column,
).join(second, first.c.first_column == second.c.second_column)
In Postgres engine(with tables created there as well), this produces:
SELECT "First".first_column,
"Second".second_column
FROM "First"
JOIN "Second" ON "First".first_column = "Second".second_column
Which runs fine. In CH though this produces.
SELECT "First".first_column,
"Second".second_column
FROM "Second", "First"
INNER JOIN "Second" ON "First".first_column = "Second".second_column
that ultimately crashes:
DB::Exception: Ambiguous column 'Second.second_column': While processing SELECT first_column AS First.first_column
, Second.second_column FROM Second CROSS JOIN First INNER JOIN Second ON First.first_column = Second.second_column. Stack trace:
Would love to have some help with this.
Versions clickhouse-driver==0.2.4 clickhouse-sqlalchemy==0.2.1 SQLAlchemy==1.4.39 python 3.10.5 CH version:22.7.1.2484
Hi.
It seems that it's not pointed out in documentation. Use select
from the package. from clickhouse_sqlalchemy import select
Hi, Thanks for the quick response. When using the select from the package I get a different error.
from clickhouse_sqlalchemy import select as ch_select
stmt = ch_select(
first.c.first_column,
second.c.second_column,
).join(second, first.c.first_column == second.c.second_column)
The resulting SQL is a bit different but still has the cross join and gives a different error:
(
SELECT "First".first_column AS first_column
,"Second".second_column AS second_column
FROM "First"
,"Second"
) AS anon_1
INNER JOIN "Second" ON "First".first_column = "Second".second_column
Executable SQL or text() construct expected, got <sqlalchemy.sql.selectable.Join at 0x7ffbe932ffd0; Join object on anon_1(140719925951552) and Second(140719925945744)>.
Could you provide whole snippet for reproducing issue?
Hi, We're using reflection to create the table objects if that matters.
from sqlalchemy import MetaData, create_engine
from clickhouse_sqlalchemy import Table, engines
from clickhouse_sqlalchemy import select as ch_select
engine = create_engine(
f"clickhouse+native://{user}:{password}@{host}:{port}/{default_db}?{args}",
pool_pre_ping=True,
)
metadata = MetaData(bind=engine)
first = Table(
"First",
metadata,
engines.Memory(),
autoload=True,
extend_existing=True,
)
second = Table(
"Second",
metadata ,
engines.Memory(),
autoload=True,
extend_existing=True,
)
stmt = ch_select(
first.c.first_column,
second.c.second_column,
).join(second, first.c.first_column == second.c.second_column)
Hi @xzkostyan , Have you had a chance to look at it? Can you reproduce it on your end? I just want to know if it's a known issue or something wrong that I'm doing Thanks is advance.
Looks like the ".join_from" method produces the correct result:
from clickhouse_sqlalchemy import select as ch_select
ch_select(first.c.first_column, second.c.second_column).join_from(
first, second, first.c.first_column == second.c.second_column
)
Produces:
SELECT "First".first_column,
"Second".second_column
FROM "First" JOIN "Second" ON "First".first_column = "Second".second_column
but it's more cumbuersome. From the SA documentation the join_from(), looks like it's supposed to work the same as a join but let us indicate the left and right side of the JOIN explicitly.