sqlc icon indicating copy to clipboard operation
sqlc copied to clipboard

Add support for CockroachDB

Open danthegoodman1 opened this issue 3 years ago • 15 comments

When trying to use sqlc for our project I noticed a few things were not supported that we require. We do use CockroachDB with the postgres engine.

  1. The ON UPDATE expression caused SQL validation errors. This is specific to CRDB.
  2. Hash sharded indexes caused SQL validation errors as well
  3. Composite primary keys with ordering like PRIMARY KEY(col1, col2 DESC) said that DESC was an error
  4. The STORING clause was also found to be an error

danthegoodman1 avatar Jan 24 '22 13:01 danthegoodman1

I feel like you don't need to tell sqlc about the exact schema because it doesn't ultimately manage the schema. All of the features you just mentioned are schema features and not query features. Like, you can imagine telling sqlc about a table that doesn't use any of that syntax and then all the queries should still work, right?

ajwerner avatar Jan 24 '22 15:01 ajwerner

@ajwerner That is true, but now I need to have SQL generating from a file that is not the true SQL of our schema, which means a second place to manage SQL and manually ensuring that it is consistent with our true SQL schema (which could lead to errors, which sqlc is designed to avoid)

For example we have the following primary key:

PRIMARY KEY(player_id, id DESC) 

sqlc does not like the DESC part of it (considers invalid SQL, which is weird because this works in postgres too...)

Removing the DESC would now be an incorrect version of the table.

Another example is this column:

updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ON UPDATE NOW(), -- auto update the timestamp

We would have to maintain another copy of that that that removes the ON UPDATE. I could see the confusion of developers now manually updating this column... maybe I need big warnings at the top of the fake schema file :)

Initially I think the risk of losing consistency between the SQL schema and the actual schema used is not worth the extra time writing the SQL wrapper functions manually in this case. Since we always have to update the code in 2 places: Either the schema and the sqlc compatible schema, or the schema and the functions that use the affected parts of the schema.

TL;DR I believe you make an excellent point, but I am still considering the balance of maintaining 2 schemas vs a schema and hand written functions.

danthegoodman1 avatar Jan 24 '22 16:01 danthegoodman1

In reality this isn't something we should be constantly iterating on anyway... so maybe that leans more in favor of doing the second fake sqlc schema...

danthegoodman1 avatar Jan 24 '22 16:01 danthegoodman1

After a bit of testing we can actually generate this fake sqlc schema from the real schema as part of the makefile by just copying, and removing things like DESC, ON UPDATE NOW(), and any index lines.

Definitely not ideal but it removes the manual maintaining of 2 schemas

danthegoodman1 avatar Jan 24 '22 16:01 danthegoodman1

I feel like you don't need to tell sqlc about the exact schema because it doesn't ultimately manage the schema.

sqlc expects to have an accurate view of your schema to power compile-time checks. Maintaining two sets of schemas is not something I want to encourage people to do unless it's working around a current issue.

We do use CockroachDB with the postgres engine.

As long as the syntax you're using is valid PostgreSQL, sqlc should be able to handle it. We don't support any CockroachDB extensions at this time.

Since this issue is actually four different bug reports, I'm going to make four new issues for each one and then close this one out.

kyleconroy avatar Jan 24 '22 17:01 kyleconroy

@kyleconroy

sqlc expects to have an accurate view of your schema to power compile-time checks. Maintaining two sets of schemas is not something I want to encourage people to do unless it's working around a current issue.

Table wise the additional generation I would do wouldn't change anything, it would only remove sorting and auto updating. That doesn't change anything about the column data types, just default values which sqlc does not seem to care about and leave to the query writer.

I do agree it's not ideal, but it's a current workaround that doesn't break sqlc.

danthegoodman1 avatar Jan 24 '22 17:01 danthegoodman1

For example, this part of a table goes from

  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ON UPDATE NOW(), -- auto update the timestamp
  ttl TIMESTAMPTZ NOT NULL,
  tokens INT64 NOT NULL,
  PRIMARY KEY(player_id, ttl DESC, id DESC)

to

 updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), -- auto update the timestamp
  ttl TIMESTAMPTZ NOT NULL,
  tokens INT64 NOT NULL,
  PRIMARY KEY(player_id, ttl, id)

danthegoodman1 avatar Jan 24 '22 17:01 danthegoodman1

As long as the syntax you're using is valid PostgreSQL, sqlc should be able to handle it. We don't support any CockroachDB extensions at this time.

At the end of the day, this is the root of the problem. Some of these are cockroach-specific syntax extensions. There's talk afoot of making the USING HASH syntax actually postgres compliant. I appreciate that you're spinning out issues to support the various features. Here's some color on each of them. I'll move the commentary over to the issues.

  1. The ON UPDATE expression caused SQL validation errors. This is specific to CRDB.

This is wholly cockroach-exclusive syntax added in 21.2. We added it pragramatically to side-step not having generalized triggers. It's very valuable, but definitely not postgres syntax. #1391

  1. Hash sharded indexes caused SQL validation errors as well

We're removing the need to specify the BUCKET_COUNT in 22.1, which, I think, when omitting that parameter, will be postgres compatible syntax.

  1. Composite primary keys with ordering like PRIMARY KEY(col1, col2 DESC) said that DESC was an error

This is a cockroach extension.

  1. The STORING clause was also found to be an error

This is a cockroach extension.

ajwerner avatar Jan 24 '22 17:01 ajwerner

All four of these are CockroachDB extensions or features. Primary key ordering isn't a PostgreSQL feature, at least not as of PostgreSQL.

kyleconroy avatar Jan 24 '22 17:01 kyleconroy

Sorry about the primary key ordering comment. Yeah, these are all extensions.

ajwerner avatar Jan 24 '22 17:01 ajwerner

Another extension that SQLC would just need to keep verbatim is AS OF SYSTEM TIME, which is used like this:

SELECT * FROM codes
AS OF SYSTEM TIME follower_read_timestamp()
WHERE code = $1

Hades32 avatar May 10 '22 09:05 Hades32

CockroachDB tables also have a special system column crdb_internal_mvcc_timestamp[1] which should be selectable. I'm curious if system columns in postgresql are currently supported.

1: https://github.com/cockroachdb/cockroach/pull/51494

tnarg avatar May 23 '22 17:05 tnarg

Much work has been done in recents months to make the cockroach SQL parser depend on less code. It's newly plausible that we'll have a go get-able parser one day in the not-so-distant future. I imagine that'll be a prerequisite to getting first-class support.

ajwerner avatar May 23 '22 17:05 ajwerner

Much work has been done in recents months to make the cockroach SQL parser depend on less code. It's newly plausible that we'll have a go get-able parser one day in the not-so-distant future.

@ajwerner I'm really excited about this! I've got a schema diffing tool that I'd like to open source, but I think it would be more appropriate to wait for your parser to be go get'able before I do.

nickjackson avatar May 23 '22 18:05 nickjackson

Another feature for the list: UPSERT. From slqc's point of view it's equivalent to INSERT

Hades32 avatar Aug 04 '22 09:08 Hades32

@nickjackson the parser library now exists! See https://github.com/cockroachdb/cockroachdb-parser.

ajwerner avatar Nov 08 '22 04:11 ajwerner

We're big fans of sqlc at @muxinc. However we're also big users of CockroachDB, and have recently started hitting some of the pain points of using the two in combination. We've already been manually managing a 2nd Postgres-compatible schema definition to power sqlc, and that's worked ok so far. Of course we'd love to avoid doing that.

More recently, however, we've started to be hit more directly by the need for CockroachDB syntax support. CockroachDB's planner is fairly limited in its index selection abilities as compared to Postgres, so we often find that we need to explicitly specify the target index for a query via index hints (SELECT FROM my_table@my_index_name) in order for the query to take advantage of the desired index. Without this, we are unable to safely convert some legacy code to sqlc because we can't rely on CockroachDB to take advantage of the correct secondary index on its own 😕

Definitely count us in as a big 👍 for this feature :pray:

bgentry avatar Nov 28 '22 17:11 bgentry

The main reason I haven’t prioritized Cockroach support in the past was the difficulty in accessing just the parser and the licensing issues. It sounds like both are now fixed.

Will need to do a bit of investigating 🕵️‍♀️

kyleconroy avatar Nov 30 '22 08:11 kyleconroy

Here's something I came across while evaluating sqlc:

schema.sql

CREATE TABLE t
(
    id         UUID   NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
    mixed_caps STRING NOT NULL,
    lower_case STRING NOT NULL AS (lower(mixed_caps)) STORED
)
schema.sql:5:33: syntax error at or near "AS"

russoj88 avatar Jul 15 '23 00:07 russoj88

@russoj88 I think for this one you can do it like:

CREATE TABLE t
(
    id         UUID   NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
    mixed_caps STRING NOT NULL,
    lower_case STRING NOT NULL GENERATED ALWAYS AS (lower(mixed_caps)) STORED
)

ajwerner avatar Jul 15 '23 03:07 ajwerner

@ajwerner that did work. Thanks!

russoj88 avatar Jul 15 '23 05:07 russoj88