drizzle-orm icon indicating copy to clipboard operation
drizzle-orm copied to clipboard

Feat: Optional database aliases for columns in table declarations

Open L-Mario564 opened this issue 1 year ago • 4 comments

This PR aims to address one of the items defined in the roadmap for v1.

You can now define columns in tables without specifying a database alias. Works for all dialects.

This:

const users = pgTable('users', {
  id: bigserial({ mode: 'number' }).primaryKey(),
  firstName: varchar(),
  lastName: varchar({ length: 50 }),
  admin: boolean()
});

Is now the same as this:

const users = pgTable('users', {
  id: bigserial('id', { mode: 'number' }).primaryKey(),
  firstName: varchar('firstName'),
  lastName: varchar('lastName', { length: 50 }),
  admin: boolean('admin')
});

Some misc. notes for this PR:

  • Moved line and point PG column types into the postgis_extension folder (previously in the main columns folder).
  • The config in custom types (for all dialects) are now required to be an object (Record<string, any>).
  • For PG:
    • Exported PgBigIntConfig, PgBigSerialConfig, PgGeometryConfig interfaces/types.
    • Added PgDateConfig, PgNumericConfig interfaces/types.
  • For MySQL:
    • Exported MySqlBigIntConfig.

(There might be more changes not noted here).

L-Mario564 avatar Aug 04 '24 22:08 L-Mario564

thanks @L-Mario564!

a few comments before I start reviewing:

Moved line and point PG column types into the postgis_extension folder (previously in the main columns folder).

  • This is a part of PostgreSQL types, so it should not be moved to postgis: https://www.postgresql.org/docs/current/datatype-geometric.html

AndriiSherman avatar Aug 06 '24 13:08 AndriiSherman

This one should be changed and I guess we can release it

AndriiSherman avatar Aug 06 '24 14:08 AndriiSherman

thanks @L-Mario564!

a few comments before I start reviewing:

Moved line and point PG column types into the postgis_extension folder (previously in the main columns folder).

  • This is a part of PostgreSQL types, so it should not be moved to postgis: https://www.postgresql.org/docs/current/datatype-geometric.html

@AndriiSherman Done.

L-Mario564 avatar Aug 06 '24 23:08 L-Mario564

@L-Mario564 we've discussed a bit more updates, so we can make this release a full one and ensure all parts of both the ORM and kit are working well

Things, that should be done:

Create a new setting in the Drizzle instance that will define custom casing mappings for column names. We can start with snake_case and camelCase

// In this case, all keys will be automatically mapped to snake_case for database aliases
const db = drizzle(client, { casing: 'snake_case' })
// In this case, all keys will be automatically mapped to camelCase for database aliases
const db = drizzle(client, { casing: 'camelCase' })
// In this case it will work the same as now in your PR
const db = drizzle(client)

As long as this setting resides in a DB instance, the casing mapping should be applied to each query, which may have a significant performance impact. We would need to have a cache with an object mapping a table to an object that represents the column key and database alias to be used. For example: { usersTable: { firstName: first_name } }.

When the DB object is created, this cache is empty, but after the first usage of a table in any query, the cache will be updated and filled with the data.

Make sure to test scenarios for queries with joins and, most importantly, for such queries:

db.select({ customKey: users.firstName, customeKey2: users.lastName }).from(users)

also one important use case would be

db.select({ customKey: sql`${users.firstName}` }).from(users)

In this case we should provide this cache to sql template chunks mapper, so we can reuse names from it. It would be awesome to have it in this way:

  • Inside query building, where we need to use tables and basically any place where a table can be passed - we need to use fillCache.
  • In places where we need to get column names for query string generation - use getCasing().
  • It will be used in SQL template generation and in all query builder functions.
class Casing {
    private cache: Record<string, Record<string,string>> = {}

    constructor(){}

    getCasing(name: string){
        // find this column in cache and retrn it
        // return name if nothing was found
    }

    fillCache(table: AnyTable) {
        // if no table was in cache
        // go through all table columns and fill them inside this.cache

        // if table was in cache - return void
    }
}

// inside PgDatabase(or other databases) class we would need to make
class PgDatabase {
    constructor(
        private casing: Casing = new Casing()
    ) {

    }
}

and then this object can be passed to needed parts of a query lifecycle

Another change should be added to Drizzle-Kit, but let's first add all the changes and features mentioned above. Then I'll guide you through the needed changes for Drizzle-Kit (it will be just about introspection and a new config value)

AndriiSherman avatar Aug 07 '24 09:08 AndriiSherman

I've done some changes based on @AndriiSherman's comment. The DB instance now has a new setting: casing.

const db = drizzle(client, {
  casing: /* 'snake_case': map columns to snake case. 'camelCase': map columns to camel case. undefined: don't do any mapping */
});

Upon providing a value for the above setting, expect a query to look like:

const users = pgTable('users', {
  id: serial().primaryKey(),
  firstName: text().notNull(),
  age: integer('AGE').notNull(),
});

const db = drizzle(client, { casing: 'snake_case' });

db.select({ firstName: users.firstName, age: users.age }).from(users);
// select "first_name", "AGE" from "users"
const users = pgTable('users', {
  id: serial().primaryKey(),
  first_name: text().notNull(),
  age: integer('AGE').notNull(),
});

const db = drizzle(client, { casing: 'camelCase' });

db.select({ first_name: users.first_name, age: users.age }).from(users);
// select "firstName", "AGE" from "users"

In both examples age remains the same since an alias is specified for it.

To improve query performance, each dialect instance stores a cache for the mapped values. It's stored in the dialect instead of the DB instance since QueryBuilder is independant from the DB instance but does share the dialect. This also means that a separate query builder will have a different cache.

const qb = new QueryBuilder({ casing: 'snake_case' });
qb.select().from(posts);
const db = drizzle(client, { casing: 'snake_case' });
db.select().from(users);
// Cache for qb: { 'public.posts.postTitle': 'post_title' }
// Cache for db: { 'public.users.firstName': 'first_name' }

Query builders in callbacks don't suffer from this issue since those stem from the DB instance.

const sq = db.$with('sq').as((qb) => qb.select().from(posts));
db.with(sq).select().from(users);
// One cache that looks like: { 'public.posts.postTitle': 'post_title', 'public.users.firstName': 'first_name' }

On a smaller note, you can also now define column types without imports:

const users = pgTable('users', (t) => ({
  id: t.serial().primaryKey(),
  firstName: t.text().notNull(),
  age: t.integer().notNull(),
}));

L-Mario564 avatar Aug 27 '24 22:08 L-Mario564