How do I specify a catalog if I'm defining mutiple tables using the sqlalchemy ORM style?
I have two tables I want to query from, the fully qualified names are foo_catalog.foo_schema.foo and bar_catalog.bar_schema.bar. I can specify a schema and a tablename in the table definition but I only get a chance to specify the catalog once in the connection.
How do I specify a catalog using declarative base style?
If I was writing raw sql I could query accross both catalogs no problem.
class Base(declarative_base(metaclass=DeclarativeABCMeta)):
__abstract__ = True
class Foo(Base):
__tablename__ = "foo"
__table_args__ = { "schema": "foo_schema"}
id: Mapped[int] = mapped_column(name="id", type_=INTEGER, primary_key=True)
class Bar(Base):
__tablename__ = "bar"
__table_args__ = { "schema": "bar_schema"}
id: Mapped[int] = mapped_column(name="id", type_=INTEGER, primary_key=True)
What happens when you attempt to query with those definitions? It should "just work". If you needed to join data across catalogs that could lead to some issues — although I'm still curious what happens when you try it.
I get the following error if I try to query a Table who'se catalog doesn't match. E.g using foo_catalog connection to look for bar I get
sqlalchemy.exc.DatabaseError: (databricks.sql.exc.ServerOperationError) [TABLE_OR_VIEW_NOT_FOUND] The table or view bar_schema.bar cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
Interesting! Does the same thing happen if you try query the table directly using the SQLAlchemy engine versus using ORM? Your answer to this makes it easier to determine where a fix would need to be applied.
What would the syntax be for querying the table directly using the SQLAlchemy engine? I wnat to make sure I'm testing the right thing :)
with engine.begin() as conn:
cursor = conn.execute("SELECT field FROM catalog1.schema1.table1 LEFT JOIN catalog2.schema2.table2 ON ...")
result = cursor.fetchall()
That works with no error (probably because the sql text has a fully qualified table name that includes catalog and schema)