ibis icon indicating copy to clipboard operation
ibis copied to clipboard

feat: CockroachDB backend

Open IndexSeek opened this issue 2 years ago • 3 comments

Is your feature request related to a problem?

I am attempting to use Ibis with the Postgres backend to connect to a serverless CockroachDB AWS cluster.

I was hoping to use the Postgres backend, but I'll need to use sqlalchemy-cockroachdb package.

Using the Ibis connect method works but fails when evaluating the version.

ibis_con = ibis.connect(
    f"postgres://{os.environ.get('COCKROACHDB_USER')}:{os.environ.get('COCKROACHDB_PASSWORD')}@{os.environ.get('COCKROACHDB_CLUSTER')}/defaultdb"
)

ibis_con.list_tables()

It raises this error:

AssertionError                            Traceback (most recent call last)
[notebook.ipynb] Cell 7 line 1
----> [1](vscode-notebook-cell:/Users/tylerwhite/Documents/GitHub/cockroachdb-trial/notebook.ipynb#X10sZmlsZQ%3D%3D?line=0) ibis_con.list_tables()

File [python3.10/site-packages/ibis/backends/base/sql/alchemy/__init__.py:168], in BaseAlchemyBackend.list_tables(self, like, database)
    167 def list_tables(self, like=None, database=None):
--> 168     tables = self.inspector.get_table_names(schema=database)
    169     views = self.inspector.get_view_names(schema=database)
    170     return self._filter_with_like(tables + views, like)

File [python3.10/site-packages/ibis/backends/base/sql/alchemy/__init__.py:175], in BaseAlchemyBackend.inspector(self)
    172 @property
    173 def inspector(self):
    174     if self._inspector is None:
--> 175         self._inspector = sa.inspect(self.con)
    176     else:
    177         self._inspector.info_cache.clear()

File [python3.10/site-packages/sqlalchemy/inspection.py:145], in inspect(subject, raiseerr)
    143 elif reg is True:
    144     return subject
--> 145 ret = reg(subject)
    146 if ret is not None:
    147     return ret
...
   3327         "Could not determine version from string '%s'" % v
   3328     )
   3329 return tuple([int(x) for x in m.group(1, 2, 3) if x is not None])

AssertionError: Could not determine version from string 'CockroachDB CCL v23.1.11 (x86_64-pc-linux-gnu, built 2023/09/27 01:53:43, go1.19.10)'

Using the sqlalchemy driver works.

sa_con = create_engine(
    f"cockroachdb://{os.environ.get('COCKROACHDB_USER')}:{os.environ.get('COCKROACHDB_PASSWORD')}@{os.environ.get('COCKROACHDB_CLUSTER')}/defaultdb"
).connect()

Describe the solution you'd like

To use Ibis with CockroachDB.

What version of ibis are you running?

ibis-framework==7.0.0 psycopg2==2.9.9 psycopg2-binary==2.9.9 SQLAlchemy==2.0.21 sqlalchemy-cockroachdb==2.0.1

What backend(s) are you using, if any?

Postgres

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

IndexSeek avatar Oct 06 '23 16:10 IndexSeek

@IndexSeek Thanks for the issue!

It's unfortunate this doesn't work out of the box, but glad to see a dialect exists for it!

Do you know if there's a CockroachDB-endorsed docker image we can use for testing?

EDIT: Looks like https://hub.docker.com/r/cockroachdb/cockroach exists

cpcloud avatar Oct 11 '23 09:10 cpcloud

Ok, after the sqlglot refactor, you can connect to a cockroachdb instance without issue.

There's a caveat on displaying output, though, because cockroach adds an explicit index column which breaks our schema handling

gforsyth avatar Jul 31 '24 18:07 gforsyth

Do you know if there's a CockroachDB-endorsed docker image we can use for testing?

EDIT: Looks like https://hub.docker.com/r/cockroachdb/cockroach exists

Sorry for the long delay on this one, @cpcloud! I was successful using that image. Here is a minimal docker-compose that I used for a quick run.

services:
  cockroach:
    image: cockroachdb/cockroach:latest-v23.1
    environment:
      COCKROACH_DATABASE: ibis_testing
    command: start-single-node --insecure
    ports:
      - "8080:8080"
      - "26257:26257"

Thank you for the info as well, @gforsyth. It's very cool that it is working now after the refactor. I didn't realize that about that index column. Thank you for the heads-up; I think I was able to repro what you described.

I found that I needed to do con.raw_sql("SET experimental_enable_temp_tables = 'on'") (I think for my memtable usage).

I used this code to test:

import ibis
from ibis import _

con = ibis.postgres.connect(
    host="localhost", port=26257, user="root", database="ibis_testing"
)

con.raw_sql("SET experimental_enable_temp_tables = 'on'")

t = ibis.range(10).unnest().name("id").as_table().mutate(squared=(_.id**2).cast("int"))

con.create_table("little_test", t, overwrite=True)

This gives me

DatabaseTable: little_test
  id      int16
  squared int64
  rowid   !int64

When I try to evaluate with something like con.table("little_test").to_*(), I get the error you describe.

This fails:

con.table("little_test").execute()

This succeeds:

con.table("little_test").select("id", "squared").execute()

While it seems a little hacky, dropping the "rowid" column seems to do the trick!

IndexSeek avatar Aug 02 '24 00:08 IndexSeek