pgroll icon indicating copy to clipboard operation
pgroll copied to clipboard

Option to execute the migration using an SQL file

Open althaf004 opened this issue 2 years ago • 6 comments

Hello,

I really appreciate how well pgroll works with JSON files. It would be even more convenient if there was an option to specify an SQL file for migration.

For instance, it would be great if you could use the following command:

./pgroll --postgres-url "postgres://{user}:{password}@{IP}:{port}/{DB}" start 01_create_table.sql --label create_table_11102023 --schema stateless_user_info

This way, you could create a schema named 'stateless_user_info_create_table_11102023' while keeping everything else the same.

Best regards, Althaf

althaf004 avatar Oct 11 '23 03:10 althaf004

++ It would be nice to have a conversion layer that is able to reads raw SQL migrations and outputs the same thing expressed as pgroll operations.

btw what we have as of today is a way to execute any SQL migration, it's just we don't guarantee zero downtime for those: https://github.com/xataio/pgroll/tree/main/docs#raw-sql

exekias avatar Oct 11 '23 07:10 exekias

In a scenario where you want to switch migration suites the raw SQL option although nice doesn't allow to have an initializer migration to run an idempotent schema migration.

Is there a way to do something along the lines of:

{
  "sql": {
    "script": "<file.sql>"
  }
}

mobinni avatar Oct 20 '23 00:10 mobinni

We have a raw SQL operation that allows you to run any arbitrary SQL for the migration: https://github.com/xataio/pgroll/tree/main/docs#raw-sql. It requires inlining the SQL statements, but otherwise, it looks pretty similar to what you are describing 🙂

exekias avatar Oct 23 '23 08:10 exekias

I've a crazy idea. What about creating an LLM prompt to translate the SQL statement into the JSON file format? If you like the idea, I can give it a try.

EmadMokhtar avatar Apr 19 '24 22:04 EmadMokhtar

@EmadMokhtar That's an interesting idea! We recently discussed the possibility of doing something like this. If we are able to convert any arbitrary SQL DDL to pgroll format it would be a step forward. In some cases, it's possible that the user still needs to provide some extra info (like the up/down functions), but overall it would make for a better story.

exekias avatar Apr 22 '24 10:04 exekias

Converting DDL to pgroll format should be pretty simple using pglast in python:

>>> from pglast import parse_sql
>>> stmt = parse_sql('alter table t1 ALTER COLUMN c5 TYPE TEXT;')[0].stmt
>>> from pprint import pprint
>>> pprint(stmt(skip_none=True))
{'@': 'AlterTableStmt',
 'cmds': ({'@': 'AlterTableCmd',
           'behavior': {'#': 'DropBehavior',
                        'name': 'DROP_RESTRICT',
                        'value': 0},
           'def_': {'@': 'ColumnDef',
                    'generated': '\x00',
                    'identity': '\x00',
                    'inhcount': 0,
                    'is_from_type': False,
                    'is_local': False,
                    'is_not_null': False,
                    'location': 28,
                    'storage': '\x00',
                    'typeName': {'@': 'TypeName',
                                 'location': 36,
                                 'names': ({'@': 'String', 'sval': 'text'},),
                                 'pct_type': False,
                                 'setof': False,
                                 'typemod': -1}},
           'missing_ok': False,
           'name': 'c5',
           'num': 0,
           'recurse': False,
           'subtype': {'#': 'AlterTableType',
                       'name': 'AT_AlterColumnType',
                       'value': 24}},),
 'missing_ok': False,
 'objtype': {'#': 'ObjectType', 'name': 'OBJECT_TABLE', 'value': 41},
 'relation': {'@': 'RangeVar',
              'inh': True,
              'location': 12,
              'relname': 't1',
              'relpersistence': 'p'}}

More information than needed by pgroll. In this case, I think you'd only need the relname for the table, name for the column, subtype.name for the operation, and def_.typeName.names[-1].sval for the new data type.

pglast can do the same for a sequence of DDL statements, up to a complete DB dump file.

chris-braidwell avatar Jun 04 '24 17:06 chris-braidwell