feat: CockroachDB backend
Is your feature request related to a problem?
I am attempting to use Ibis with the Postgres backend to connect to a serverless CockroachDB AWS cluster.
I was hoping to use the Postgres backend, but I'll need to use sqlalchemy-cockroachdb package.
Using the Ibis connect method works but fails when evaluating the version.
ibis_con = ibis.connect(
f"postgres://{os.environ.get('COCKROACHDB_USER')}:{os.environ.get('COCKROACHDB_PASSWORD')}@{os.environ.get('COCKROACHDB_CLUSTER')}/defaultdb"
)
ibis_con.list_tables()
It raises this error:
AssertionError Traceback (most recent call last)
[notebook.ipynb] Cell 7 line 1
----> [1](vscode-notebook-cell:/Users/tylerwhite/Documents/GitHub/cockroachdb-trial/notebook.ipynb#X10sZmlsZQ%3D%3D?line=0) ibis_con.list_tables()
File [python3.10/site-packages/ibis/backends/base/sql/alchemy/__init__.py:168], in BaseAlchemyBackend.list_tables(self, like, database)
167 def list_tables(self, like=None, database=None):
--> 168 tables = self.inspector.get_table_names(schema=database)
169 views = self.inspector.get_view_names(schema=database)
170 return self._filter_with_like(tables + views, like)
File [python3.10/site-packages/ibis/backends/base/sql/alchemy/__init__.py:175], in BaseAlchemyBackend.inspector(self)
172 @property
173 def inspector(self):
174 if self._inspector is None:
--> 175 self._inspector = sa.inspect(self.con)
176 else:
177 self._inspector.info_cache.clear()
File [python3.10/site-packages/sqlalchemy/inspection.py:145], in inspect(subject, raiseerr)
143 elif reg is True:
144 return subject
--> 145 ret = reg(subject)
146 if ret is not None:
147 return ret
...
3327 "Could not determine version from string '%s'" % v
3328 )
3329 return tuple([int(x) for x in m.group(1, 2, 3) if x is not None])
AssertionError: Could not determine version from string 'CockroachDB CCL v23.1.11 (x86_64-pc-linux-gnu, built 2023/09/27 01:53:43, go1.19.10)'
Using the sqlalchemy driver works.
sa_con = create_engine(
f"cockroachdb://{os.environ.get('COCKROACHDB_USER')}:{os.environ.get('COCKROACHDB_PASSWORD')}@{os.environ.get('COCKROACHDB_CLUSTER')}/defaultdb"
).connect()
Describe the solution you'd like
To use Ibis with CockroachDB.
What version of ibis are you running?
ibis-framework==7.0.0 psycopg2==2.9.9 psycopg2-binary==2.9.9 SQLAlchemy==2.0.21 sqlalchemy-cockroachdb==2.0.1
What backend(s) are you using, if any?
Postgres
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
@IndexSeek Thanks for the issue!
It's unfortunate this doesn't work out of the box, but glad to see a dialect exists for it!
Do you know if there's a CockroachDB-endorsed docker image we can use for testing?
EDIT: Looks like https://hub.docker.com/r/cockroachdb/cockroach exists
Ok, after the sqlglot refactor, you can connect to a cockroachdb instance without issue.
There's a caveat on displaying output, though, because cockroach adds an explicit index column which breaks our schema handling
Do you know if there's a CockroachDB-endorsed docker image we can use for testing?
EDIT: Looks like https://hub.docker.com/r/cockroachdb/cockroach exists
Sorry for the long delay on this one, @cpcloud! I was successful using that image. Here is a minimal docker-compose that I used for a quick run.
services:
cockroach:
image: cockroachdb/cockroach:latest-v23.1
environment:
COCKROACH_DATABASE: ibis_testing
command: start-single-node --insecure
ports:
- "8080:8080"
- "26257:26257"
Thank you for the info as well, @gforsyth. It's very cool that it is working now after the refactor. I didn't realize that about that index column. Thank you for the heads-up; I think I was able to repro what you described.
I found that I needed to do con.raw_sql("SET experimental_enable_temp_tables = 'on'") (I think for my memtable usage).
I used this code to test:
import ibis
from ibis import _
con = ibis.postgres.connect(
host="localhost", port=26257, user="root", database="ibis_testing"
)
con.raw_sql("SET experimental_enable_temp_tables = 'on'")
t = ibis.range(10).unnest().name("id").as_table().mutate(squared=(_.id**2).cast("int"))
con.create_table("little_test", t, overwrite=True)
This gives me
DatabaseTable: little_test
id int16
squared int64
rowid !int64
When I try to evaluate with something like con.table("little_test").to_*(), I get the error you describe.
This fails:
con.table("little_test").execute()
This succeeds:
con.table("little_test").select("id", "squared").execute()
While it seems a little hacky, dropping the "rowid" column seems to do the trick!