piccolo icon indicating copy to clipboard operation
piccolo copied to clipboard

First class Cockroach DB support.

Open gnat opened this issue 2 years ago • 72 comments

Friction-free Cockroach DB support in Piccolo ORM.

PR: https://github.com/piccolo-orm/piccolo/pull/608

We'd like to see Cockroach Labs promote Piccolo as one of, if not the recommended async ORM. asyncpg has some support built-in to help us get there.

In the mean time I will be adding notes on how to use CockroachDB with Piccolo, and general progress.

Quick local setup for testing

  • Download/extract (single binary): https://www.cockroachlabs.com/docs/stable/install-cockroachdb-linux.html
  • Start Cockroach: ./cockroach start-single-node --insecure --store=node1 --listen-addr=localhost:26257 --http-addr=localhost:8080 --background
  • Enter SQL client: ./cockroach sql --insecure
    • Can create database ...; use ...; from here.
  • CTRL-C or killall cockroach to send graceful termination signal.

Default account is root without password. Fancy stats dashboard at http://localhost:8080

Both Piccolo Examples and Piccolo Admin are working.

image

What we still need

  • [x] ~~Get build/test suite running on Github Actions for Piccolo QA tooling.~~
  • [x] ~~Some way to identify we're running under Cockroach DB.~~
    • ~~Added CockroachDB Engine.~~
  • [x] ~~Get test suite mostly / fully passing.~~
  • [x] ~~Basic documentation~~
  • [x] ~~Time Travel queries using AS OF SYSTEM TIME~~
    • ~~Some way to do Time Travel SELECT queries (SELECT ... AS OF SYSTEM TIME)~~
    • ~~https://www.cockroachlabs.com/docs/stable/as-of-system-time.html~~
    • :heavy_check_mark: See: https://github.com/piccolo-orm/piccolo/issues/607#issuecomment-1251330590
  • [x] Sharding for Sequential Indexes
    • :warning: Done, but will go into its own PR after this one: https://github.com/piccolo-orm/piccolo/issues/622
    • Some way to do "Hash Sharded" Indexes (INDEX ... USING HASH)
    • https://www.cockroachlabs.com/docs/stable/hash-sharded-indexes.html

:bug: Known issues in CockroachDB or Asyncpg

[ ] ARRAY support between asyncpg ↔️ cockroach.

  • Currently affected by: https://github.com/cockroachdb/cockroach/issues/71908
  • See: piccolo/tests/columns/test_array.py
  • M2M works, but as_list=True does not because it uses ARRAY

cannot drop UNIQUE constraint "..." using ALTER TABLE DROP CONSTRAINT, use DROP INDEX CASCADE instead

  • Tracked: https://github.com/cockroachdb/cockroach/issues/42840
  • Affecting test_alter_column_unique() in test_migrations.py

column ... is of type int[] and thus is not indexable / column ... is of type varchar[] and thus is not indexable

  • Tracked: https://github.com/cockroachdb/cockroach/issues/35730
  • Affecting both test_array_column_integer() and test_array_column_varchar in test_migration_manager.py

:moon: Future nice to haves

CockroachDB executes ALTER COLUMN queries asynchronously.

  • In the short term this can be mitigated by keeping migrations small.
  • At odds with Piccolo's assumption that the database is altered before the next migration operation begins. CockroachDB will give a warning: table <...> is currently undergoing a schema change if a later operation tries to modify the table before the asynchronous query finishes. CockroachDB may fix this in the future.

gnat avatar Aug 30 '22 15:08 gnat

:heavy_check_mark: Historical Issues / Solutions of Note

:heavy_check_mark: All issues are resolved in https://github.com/piccolo-orm/piccolo/pull/608 but are here if you need to apply them manually.

:heavy_check_mark: :bug: expected DEFAULT expression to have type timestamp, but 'current_timestamp() + '01:00:00'' has type timestamptz

https://github.com/piccolo-orm/piccolo/blob/ed93feb85050895cc54e7db9650eddc69ab94fa0/piccolo/columns/defaults/timestamp.py#L24

return f"CURRENT_TIMESTAMP::timestamp + INTERVAL '{interval_string}'"

https://github.com/piccolo-orm/piccolo/blob/ed93feb85050895cc54e7db9650eddc69ab94fa0/piccolo/columns/defaults/timestamp.py#L45

return "current_timestamp::timestamp"

:heavy_check_mark: :bug: asyncpg.exceptions.FeatureNotSupportedError: unimplemented: extension "uuid-ossp" is not yet supported

https://github.com/piccolo-orm/piccolo/blob/d01066002e9932ff7b7ea7e2e678b8a54faf666c/piccolo/engine/postgres.py#L248

#extensions = ["uuid-ossp"]

Or add extensions=[] in piccolo_config.py when creating DB = PostgresEngine(...

:heavy_check_mark: :bug: ALTER COLUMN TYPE is not supported inside a transaction

SET CLUSTER SETTING sql.defaults.experimental_alter_column_type.enabled = true;

Same strategy employed by django-cockroach

  • Affecting test_alter_column_digits() in test_migration_manager.py
  • Tracked: https://github.com/cockroachdb/cockroach/issues/49351?version=v22.1
  • Related: https://github.com/cockroachdb/cockroach/issues/49329

gnat avatar Aug 30 '22 17:08 gnat

No worries, we'll use this issue instead as it's more detailed.

dantownsend avatar Aug 30 '22 18:08 dantownsend

I experimented a bit and researched about this.

Tests / CI

  • Add Cockroach to the Piccolo QA tooling.

For Github actions only found an example of how Peewee ORM does it (basically downloading CockroachDB, starting with start-single-node and creating a test database using CockroachDB sql - same as a local installation but I don't know how it works in practice). Peewee also has special field-types for CockroachDB specifics (primary key and uuid).

I was able to run the tests locally using PostgresEngine and created a few abstract methods e.g

# in columns/base.py
class Default(ABC):
    @abstractproperty
    def postgres(self) -> str:
        pass

    @abstractproperty
    def sqlite(self) -> str:
        pass

    @abstractproperty
    def cockroach(self) -> str:
        pass

# and then in columns/timestamp.py
class TimestampNow(Default):
    @property
    def postgres(self):
        return "current_timestamp"

    @property
    def sqlite(self):
        return "current_timestamp"

    @property
    def cockroach(self):
        return "current_timestamp::timestamp"

for the CockroachDB specifics (because without that Postgres tests are failing). The result is that a lot of the tests failed (DuplicateTableError, InterfaceError etc.) and there is also a few errors. Also all tests that return a serial ID in the response fail because CockroachDB has a different primary key implementation.

test_result

Tests run 10 times slower than postgres (even with start-single-node) or I don't know how to fine tune it (I use this command to start single node cockroach start-single-node --insecure --store=node1 --store=type=mem,size=0.5 --listen-addr=localhost:26257 on my local machine (4 cores - 4GB RAM + 2GB swap)). After the tests were completed, I had to use DROP DATABASE piccolo CASCADE and again CREATE DATABASE piccolo because the database is not cleaned after the test suite is finished?

Sorry for the long comment, I hope this is useful.

sinisaos avatar Aug 31 '22 14:08 sinisaos

For reference, in my experience, django-cockroach is also a fairly fleshed out postresql :arrow_right: cockroachdb layer, and may also serve as a good reference point for common solutions.

Peewee also has special field-types for CockroachDB specifics (primary key and uuid)

Yeah I do believe this is to set the correct default value (Ex: DEFAULT unique_rowid() and DEFAULT gen_random_uuid()). Here it is in django-cockroach: https://github.com/cockroachdb/django-cockroachdb/blob/0d44aaed3b414e263ee9f9ffb7ed71f585fdb9a1/django_cockroachdb/base.py#L34

gnat avatar Aug 31 '22 15:08 gnat

I think we could do something like this, but Daniel would know better. First create new CockroachEngine engine

from __future__ import annotations
import typing as t
from piccolo.engine.postgres import PostgresEngine

class CockroachEngine(PostgresEngine):
    def __init__(
        self,
        config: t.Dict[str, t.Any],
        extensions: t.Sequence[str] = (),
    ) -> None:
        self.config = config
        self.extensions = extensions
        super().__init__(config=config, extensions=extensions)
       
        engine_type = "cockroach"
        ...

and then change Serial column to this

class Serial(Column):
    """
    An alias to an autoincrementing integer column in Postgres.
    """

    @property
    def column_type(self):
        engine_type = self._meta.engine_type
        if engine_type == "postgres":
            return "SERIAL"
        elif engine_type == "sqlite":
            return "INTEGER"
        elif engine_type == "cockroach":
            return "DEFAULT unique_rowid()"
        raise Exception("Unrecognized engine type")

And then do the same for UUID column.

I have no experience with CockroachDB, but do you know why the tests take so long? I think this is a problem for GH actions.

sinisaos avatar Aug 31 '22 16:08 sinisaos

@sinisaos Yeah, something like that will work.

This needs changing too:

https://github.com/piccolo-orm/piccolo/blob/ed93feb85050895cc54e7db9650eddc69ab94fa0/piccolo/query/base.py#L272-L294

So the logic for cockroach is:

  • If cockroach query exists, then use it
  • Otherwise, if postgres query exists, then use it
  • Otherwise, use default query

dantownsend avatar Aug 31 '22 16:08 dantownsend

--store=type=mem,size=0.5

Would bump that to at least 2GB. CockroachDB tends to choke up on anything less.

Tests are running fairly quickly here, but I'm currently hanging on tests/table/instance/test_save.py

gnat avatar Aug 31 '22 17:08 gnat

On the bright side the tests are running here in 2:13

image Currently running into this issue of Cockroach sending back cannot perform operation: another operation is in progress. Think this may be because Cockroach is performing operations asynchronously and we aren't retrying.

EDIT: Actually, probably running into the "multiple active portals" issue: https://github.com/MagicStack/asyncpg/issues/738 https://github.com/MagicStack/asyncpg/issues/580 https://github.com/cockroachdb/cockroach/pull/83164

gnat avatar Aug 31 '22 17:08 gnat

@dantownsend You're right. I just gave an example of what we could do. Maybe you should install CockroachDB when you have time and try to run the tests. @gnat You're right. If I run with just cockroach start-single-node --insecure --store=node1 result is much better - 8 min but that's probably because of my old and slow laptop :)

sinisaos avatar Aug 31 '22 17:08 sinisaos

I currently "work around" the cannot perform operation: another operation is in progress in my own internal ORM by replacing fetchval with fetch-- but I do not see fetchval being used in Piccolo.. so it's something else triggering the "multiple active portals" issue..

gnat avatar Aug 31 '22 17:08 gnat

@dantownsend any ideas?

gnat avatar Aug 31 '22 17:08 gnat

@gnat I browsed some issues on cockroachdb (like this https://github.com/cockroachdb/cockroach/issues/40195). I don't have a definite answer, just some ideas:

  • Could be something to do with transactions.
  • Or maybe Piccolo isn't closing a connection somewhere.
  • A bug with asyncpg?

I need to play around with it a bit more.

dantownsend avatar Aug 31 '22 18:08 dantownsend

So, some tests are problematic for Cockroach (anything testing sequential keys starting at 1), because even though we can run Cockroach in "compatibility mode", most Cockroach users will never want to do this because it's terrible for performance in Cockroach (the cluster would have to perform a very expensive "full cluster" consensus on INSERT and essentially have a cluster-wide lock :scream:)

Notably "id": 1 and "manager": 1 here:

band = Band(name="Pythonistas", popularity=1000, manager=manager)
band.save().run_sync()

self.assertEqual(
    Band.select().run_sync(),
    [
        {
            "id": 1,
            "name": "Pythonistas",
            "manager": 1,
            "popularity": 1000,
        }
    ],
)

So we have a few options:

Add to the current test suite.

  • Perhaps some decorators like @cockroach_only and @cockroach_never (needed to exclude tests guaranteed to fail), and have extra tests just for Cockroach.
  • I think this is the way to go, but our test suite will get bigger.
  • Any cleaner ways to approach this?

Run compatibility mode.

  • Add DEFAULT nextval() for Serial at Table creation time just for Cockroach tests.
  • Really not sure how to do this just at Table creation time without the new DEFAULT being used everywhere else and breaking other tests. @dantownsend
  • As explained above, most people are going to want DEFAULT unique_rowid() or UUID, but neither of these will pass the current test suite.
  • Explained here: https://www.cockroachlabs.com/docs/stable/serial.html

Remove sequential keys from the current assert checks.

  • Kinda easy but not very desirable. Mangles current tests.

Ignore large swaths of tests for Cockroach.

  • Easiest but not very desirable.

gnat avatar Aug 31 '22 20:08 gnat

Major work in progress, but figured I'd start pushing up progress so people can try things out as we go.

gnat avatar Aug 31 '22 21:08 gnat

Notably "id": 1 and "manager": 1 here:

Yes, that's a tricky problem.

We currently have an approach for just running certain tests for certain databases:

https://github.com/piccolo-orm/piccolo/blob/ed93feb85050895cc54e7db9650eddc69ab94fa0/tests/base.py#L33-L39

We can create a cockroach_only decorator. It's might not be sufficient though, if we have loads of tests which don't run on Cockroach - we would also need to mark a bunch of tests as not_cockroach. A cleaner approach might be to create a new decorator called for_engines. For example:

@for_engines('sqlite', 'postgres')
class Test1(TestCase):
    ...

But as you said, ignoring large numbers of tests isn't ideal.

Run compatibility mode. Add DEFAULT nextval() for Serial at Table creation time just for Cockroach tests. Really not sure how to do this just at Table creation time without the new DEFAULT being used everywhere else and breaking other tests. @dantownsend As explained above, most people are going to want DEFAULT unique_rowid() or UUID, but neither of these will pass the current test suite. Explained here: https://www.cockroachlabs.com/docs/stable/serial.html

It's a smart idea. I'm not sure how best to implement it though. Whenever we create tables in test, we could do it via a custom function which set the tables up correctly if using cockroach.

class Test1(Table):
    def setUp(self):
        create_test_tables(Band)

I need to think about it a bit more ...

dantownsend avatar Aug 31 '22 22:08 dantownsend

@for_engines('sqlite', 'postgres')
class Test1(TestCase):
    ...

This is smart. Allows us to pick and choose specific tests per engine and easily add other engines in the future. Would allow us to repurpose the Cockroach tests for other massively distributed databases as well.

If you don't mind, what does the implementation of this @for_engines decorator look like? I am having trouble with it. :sweat_smile:

Here's an example of extending our test suite with @for_engines would look like.

Keeps the test suite comprehensive and the division of tests are clear.

I've re-worked that test for Cockroach (it just ignores "id" and "manager") but of course, feedback and ideas on how to do it better are much appreciated.

test_save.py

from unittest import TestCase

from piccolo.table import create_db_tables_sync, drop_db_tables_sync
from tests.example_apps.music.tables import Band, Manager
from tests.base import cockroach_only, no_cockroach


class TestSave(TestCase):
    def setUp(self):
        create_db_tables_sync(Manager, Band)

    def tearDown(self):
        drop_db_tables_sync(Manager, Band)

    def test_save_new(self):
        """
        Make sure that saving a new instance works.
        """
        manager = Manager(name="Maz")

        query = manager.save()
        print(query)
        self.assertTrue("INSERT" in query.__str__())

        query.run_sync()

        names = [i["name"] for i in Manager.select(Manager.name).run_sync()]
        self.assertTrue("Maz" in names)

        manager.name = "Maz2"
        query = manager.save()
        print(query)
        self.assertTrue("UPDATE" in query.__str__())

        query.run_sync()
        names = [i["name"] for i in Manager.select(Manager.name).run_sync()]
        self.assertTrue("Maz2" in names)
        self.assertTrue("Maz" not in names)

    @for_engines('sqlite', 'postgres')
    def test_save_specific_columns(self):
        """
        Make sure that we can save a subset of columns.
        """
        manager = Manager(name="Guido")
        manager.save().run_sync()

        band = Band(name="Pythonistas", popularity=1000, manager=manager)
        band.save().run_sync()

        self.assertEqual(
            Band.select().run_sync(),
            [
                {
                    "id": 1,
                    "name": "Pythonistas",
                    "manager": 1,
                    "popularity": 1000,
                }
            ],
        )

        band.name = "Pythonistas 2"
        band.popularity = 2000
        band.save(columns=[Band.name]).run_sync()

        # Only the name should update, and not the popularity:
        self.assertEqual(
            Band.select().run_sync(),
            [
                {
                    "id": 1,
                    "name": "Pythonistas 2",
                    "manager": 1,
                    "popularity": 1000,
                }
            ],
        )

        # Also test it using strings to identify columns
        band.name = "Pythonistas 3"
        band.popularity = 3000
        band.save(columns=["popularity"]).run_sync()

        # Only the popularity should update, and not the name:
        self.assertEqual(
            Band.select().run_sync(),
            [
                {
                    "id": 1,
                    "name": "Pythonistas 2",
                    "manager": 1,
                    "popularity": 3000,
                }
            ],
        )

    @for_engines('cockroach')
    def test_save_specific_columns(self):
        """
        Make sure that we can save a subset of columns.
        """
        manager = Manager(name="Guido")
        manager.save().run_sync()

        band = Band(name="Pythonistas", popularity=1000, manager=manager)
        band.save().run_sync()

        self.assertEqual(
            Band.select().run_sync()[0] | {
                "name":"Pythonistas",
                "popularity": 1000,
            },
            Band.select().run_sync()[0]
        )

        band.name = "Pythonistas 2"
        band.popularity = 2000
        band.save(columns=[Band.name]).run_sync()

        # Only the name should update, and not the popularity:
        self.assertEqual(
            Band.select().run_sync()[0] | {
                "name":"Pythonistas 2",
                "popularity": 1000,
            },
            Band.select().run_sync()[0]
        )

        # Also test it using strings to identify columns
        band.name = "Pythonistas 3"
        band.popularity = 3000
        band.save(columns=["popularity"]).run_sync()

        # Only the popularity should update, and not the name:
        self.assertEqual(
            Band.select().run_sync()[0] | {
                "name":"Pythonistas 2",
                "popularity": 3000,
            },
            Band.select().run_sync()[0]
        )

gnat avatar Sep 01 '22 00:09 gnat

Another alternative.

We can use if for_engines('sqlite', 'postgres'): and if for_engines('cockroach'): on the individual self.assert..

Direct comparison to the previous for test_save.py. Your thoughts, ideas?

from unittest import TestCase

from piccolo.table import create_db_tables_sync, drop_db_tables_sync
from tests.example_apps.music.tables import Band, Manager
from tests.base import for_engines


class TestSave(TestCase):
    def setUp(self):
        create_db_tables_sync(Manager, Band)

    def tearDown(self):
        drop_db_tables_sync(Manager, Band)

    def test_save_new(self):
        """
        Make sure that saving a new instance works.
        """
        manager = Manager(name="Maz")

        query = manager.save()
        print(query)
        self.assertTrue("INSERT" in query.__str__())

        query.run_sync()

        names = [i["name"] for i in Manager.select(Manager.name).run_sync()]
        self.assertTrue("Maz" in names)

        manager.name = "Maz2"
        query = manager.save()
        print(query)
        self.assertTrue("UPDATE" in query.__str__())

        query.run_sync()
        names = [i["name"] for i in Manager.select(Manager.name).run_sync()]
        self.assertTrue("Maz2" in names)
        self.assertTrue("Maz" not in names)

    def test_save_specific_columns(self):
        """
        Make sure that we can save a subset of columns.
        """
        manager = Manager(name="Guido")
        manager.save().run_sync()

        band = Band(name="Pythonistas", popularity=1000, manager=manager)
        band.save().run_sync()

        if for_engines('sqlite', 'postgres'):
            self.assertEqual(
                Band.select().run_sync(),
                [
                    {
                        "id": 1,
                        "name": "Pythonistas",
                        "manager": 1,
                        "popularity": 1000,
                    }
                ],
            )

        if for_engines('cockroach'):
            self.assertEqual(
                Band.select().run_sync()[0] | {
                    "name":"Pythonistas",
                    "popularity": 1000,
                },
                Band.select().run_sync()[0]
            )

        band.name = "Pythonistas 2"
        band.popularity = 2000
        band.save(columns=[Band.name]).run_sync()

        # Only the name should update, and not the popularity:
        if for_engines('sqlite', 'postgres'):
            self.assertEqual(
                Band.select().run_sync(),
                [
                    {
                        "id": 1,
                        "name": "Pythonistas 2",
                        "manager": 1,
                        "popularity": 1000,
                    }
                ],
            )

        if for_engines('cockroach'):
            self.assertEqual(
                Band.select().run_sync()[0] | {
                    "name":"Pythonistas 2",
                    "popularity": 1000,
                },
                Band.select().run_sync()[0]
            )

        # Also test it using strings to identify columns
        band.name = "Pythonistas 3"
        band.popularity = 3000
        band.save(columns=["popularity"]).run_sync()

        # Only the popularity should update, and not the name:
        if for_engines('sqlite', 'postgres'):
            self.assertEqual(
                Band.select().run_sync(),
                [
                    {
                        "id": 1,
                        "name": "Pythonistas 2",
                        "manager": 1,
                        "popularity": 3000,
                    }
                ],
            )

        if for_engines('cockroach'):
            self.assertEqual(
                Band.select().run_sync()[0] | {
                    "name":"Pythonistas 2",
                    "popularity": 3000,
                },
                Band.select().run_sync()[0]
            )

gnat avatar Sep 01 '22 06:09 gnat

Sorry to interrupt, but we may have additional issues until asyncpg is fully supported in Cockroach. I started running individual tests and a few exceptions are currently not supported. For example, tests for array columns throw exception asyncpg.exceptions._base.UnknownPostgresError: unable to decorrelate subquery. I don't think it's implemented yet issue.

I also haven't found an alternative to the Postgres FORMAT() function used for readable columns. https://github.com/piccolo-orm/piccolo/blob/ed93feb85050895cc54e7db9650eddc69ab94fa0/piccolo/columns/readable.py#L40-L42

There are probably more such situations that we haven't discovered yet.

sinisaos avatar Sep 01 '22 07:09 sinisaos

Yeah I encountered that the other day as well, thought I'd come back to it as it seems non-essential, maybe @dantownsend can comment. We may be able to just stub it for cockroach for now or put in a placeholder.

I don't see many viable alternatives to postgres format() or sqlite printf() in the function reference unless I'm missing something. Odd since there's tons of other string formatting options.

@knz Any thoughts?

gnat avatar Sep 01 '22 08:09 gnat

Instead of format we might be able to use || which concatenates strings and values.

We would have to refactor this:

https://github.com/piccolo-orm/piccolo/blob/ed93feb85050895cc54e7db9650eddc69ab94fa0/piccolo/columns/readable.py#L13

But I think it's possible.

dantownsend avatar Sep 01 '22 08:09 dantownsend

If there are lots of problems with asyncpg, another solution is to build the CockroachDB engine based off psycopg3 instead, which seems to be highly compatible with CockRoachDB.

The engines in Piccolo aren't super complex, so it's possible.

dantownsend avatar Sep 01 '22 09:09 dantownsend

If you don't mind, what does the implementation of this @for_engines decorator look like? I am having trouble with it. 😅

I think it looks something like this:

ENGINE = engine_finder()

def for_engines(*engine_names: str):
    if ENGINE:
        current_engine_name = ENGINE.engine_type
        if current_engine_name not in engine_names:

            def wrapper(func):
                return pytest.mark.skip(
                    f"Not running for {current_engine_name}"
                )(func)

            return wrapper
        else:

            def wrapper(func):
                return func

            return wrapper
    else:
        raise ValueError("Engine not found")

dantownsend avatar Sep 01 '22 09:09 dantownsend

We can use if for_engines('sqlite', 'postgres'): and if for_engines('cockroach'): on the individual self.assert..

We could do that. We would end up with quite a bit of branching in the tests, but in the end it might be the best solution.

I'm still hopefully that there's a super clean solution we haven't thought of yet ... 🤔

dantownsend avatar Sep 01 '22 09:09 dantownsend

We would end up with quite a bit of branching in the tests

This is what I'm experiencing.

Your decorator creates much flatter tests, and the functions themselves are shorter, which is nice.

Also I'm realising I'm needing to do RETURNING id on many of the INSERT, because the id is semi-random, which clutters the postgres and sqlite tests a bit too much for my taste.

Thankfully many tests pass fine unmodified (ideal situation).

gnat avatar Sep 01 '22 09:09 gnat

If there are lots of problems with asyncpg, another solution is to build the CockroachDB engine based off psycopg3 instead, which seems to be highly compatible with CockRoachDB.

The engines in Piccolo aren't super complex, so it's possible.

Interesting. On that note, it seems like @dvarrazzo has committed extra time into ensuring a smooth CRDB story in psycopg: https://github.com/psycopg/psycopg/issues/313

gnat avatar Sep 01 '22 10:09 gnat

@knz Any thoughts?

  • "cannot decorrelate subquery": we're extremely interested to get that fixed. CockroachDB is supposed to be able to decorrelate a lot nowadays, so your use case is probably a bug. Can you tell us more and then file an issue on the crdb repo?

  • format: see here https://github.com/cockroachdb/cockroach/issues/87260 and indeed using || and the ::string cast conversion will help as workaround.

knz avatar Sep 01 '22 12:09 knz

@dantownsend Any advice on how to handle the default type checks tests in tests/apps/migrations/auto/integration/test_migrations.py ?

Example failed test:

    def test_bigint_column(self):
        self._test_migrations(
            table_snapshots=[
                [self.table(column)]
                for column in [
                    BigInt(),
                    BigInt(default=1),
                    BigInt(default=integer_default),
                    BigInt(null=True),
                    BigInt(null=False),
                    BigInt(index=True),
                    BigInt(index=False),
                ]
            ],
            test_function=lambda x: all(
                [
                    x.data_type == "bigint",
                    x.is_nullable == "NO",
                    x.column_default == "0",
                ]
            ),
        )

Fails with:

AssertionError: False is not true : Meta is incorrect: RowMeta(column_default='0:::INT8', column_name='my_column', is_nullable='NO', table_name='my_table', character_maximum_length=None, data_type='bigint', numeric_precision=64, numeric_scale=0, numeric_precision_radix=2)

CockroachDB is adding type casting aka '0:::INT8'

Do I really need to write all new tests with the casts included?

Or is there a way to override or work around this? Suggestions appreciated.

gnat avatar Sep 01 '22 13:09 gnat

Can we do something like:

x.column_default in ("0", "0:::INT8")

Maybe with a comment next to it saying that 0:::INT8 is for CockroachDB.

dantownsend avatar Sep 01 '22 13:09 dantownsend

Very elegant. Thank you!

Also I keep bumping up into ALTER COLUMN TYPE is not supported inside a transaction https://github.com/cockroachdb/cockroach/issues/49351

Apparently there's work being done on this at Cockroach HQ but is there any way to turn off transactions for these ALTER COLUMN TYPE operations just for Cockroach in the short term? Thoughts?

gnat avatar Sep 01 '22 13:09 gnat

ALTER COLUMN TYPE is generally something one wants to avoid in CockroachDB anyway, as it's terrible for performance across a cluster, so perhaps disabling those for Cockroach is the way to go.

gnat avatar Sep 01 '22 13:09 gnat