Option to execute the migration using an SQL file
Hello,
I really appreciate how well pgroll works with JSON files. It would be even more convenient if there was an option to specify an SQL file for migration.
For instance, it would be great if you could use the following command:
./pgroll --postgres-url "postgres://{user}:{password}@{IP}:{port}/{DB}" start 01_create_table.sql --label create_table_11102023 --schema stateless_user_info
This way, you could create a schema named 'stateless_user_info_create_table_11102023' while keeping everything else the same.
Best regards, Althaf
++ It would be nice to have a conversion layer that is able to read raw SQL migrations and output the same thing expressed as pgroll operations.
By the way, what we have as of today is a way to execute any SQL migration; we just don't guarantee zero downtime for those: https://github.com/xataio/pgroll/tree/main/docs#raw-sql
In a scenario where you want to switch migration suites, the raw SQL option, although nice, doesn't allow you to have an initializer migration that runs an idempotent schema migration from a file.
Is there a way to do something along the lines of:
```json
{
  "sql": {
    "script": "<file.sql>"
  }
}
```
We have a raw SQL operation that allows you to run any arbitrary SQL for the migration: https://github.com/xataio/pgroll/tree/main/docs#raw-sql. It requires inlining the SQL statements, but otherwise, it looks pretty similar to what you are describing 🙂
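For reference, a complete migration file using the raw SQL operation with inlined statements looks roughly like this. This is a sketch based on the linked docs; the table definition and migration name are made up for illustration, so check the docs for the exact schema:

```json
{
  "name": "01_create_table",
  "operations": [
    {
      "sql": {
        "up": "CREATE TABLE users (id serial PRIMARY KEY, name text)",
        "down": "DROP TABLE users"
      }
    }
  ]
}
```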
I have a crazy idea. What about creating an LLM prompt to translate SQL statements into the JSON file format? If you like the idea, I can give it a try.
@EmadMokhtar That's an interesting idea! We recently discussed the possibility of doing something like this. If we are able to convert any arbitrary SQL DDL to pgroll format, it would be a step forward. In some cases, the user may still need to provide some extra info (like the up/down functions), but overall it would make for a better story.
Converting DDL to pgroll format should be pretty simple using pglast in python:
```python
>>> from pglast import parse_sql
>>> stmt = parse_sql('alter table t1 ALTER COLUMN c5 TYPE TEXT;')[0].stmt
>>> from pprint import pprint
>>> pprint(stmt(skip_none=True))
{'@': 'AlterTableStmt',
 'cmds': ({'@': 'AlterTableCmd',
           'behavior': {'#': 'DropBehavior',
                        'name': 'DROP_RESTRICT',
                        'value': 0},
           'def_': {'@': 'ColumnDef',
                    'generated': '\x00',
                    'identity': '\x00',
                    'inhcount': 0,
                    'is_from_type': False,
                    'is_local': False,
                    'is_not_null': False,
                    'location': 28,
                    'storage': '\x00',
                    'typeName': {'@': 'TypeName',
                                 'location': 36,
                                 'names': ({'@': 'String', 'sval': 'text'},),
                                 'pct_type': False,
                                 'setof': False,
                                 'typemod': -1}},
           'missing_ok': False,
           'name': 'c5',
           'num': 0,
           'recurse': False,
           'subtype': {'#': 'AlterTableType',
                       'name': 'AT_AlterColumnType',
                       'value': 24}},),
 'missing_ok': False,
 'objtype': {'#': 'ObjectType', 'name': 'OBJECT_TABLE', 'value': 41},
 'relation': {'@': 'RangeVar',
              'inh': True,
              'location': 12,
              'relname': 't1',
              'relpersistence': 'p'}}
```
That's more information than pgroll needs. In this case, I think you'd only need the relname for the table, name for the column, subtype.name for the operation, and def_.typeName.names[-1].sval for the new data type.
pglast can do the same for a sequence of DDL statements, up to a complete DB dump file.
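To make the field mapping concrete, here is a minimal sketch that pulls those four fields out of the serialized parse tree and builds an operation dict. The output shape (an "alter_column" object with table/column/type keys) is an approximation for illustration, not pgroll's exact operation schema; the input dict is a trimmed version of the pprint output above so the sketch runs without pglast installed:

```python
def alter_type_to_pgroll(stmt: dict) -> dict:
    """Map a serialized AlterTableStmt (ALTER COLUMN ... TYPE) to a
    pgroll-like operation dict. Only this one subtype is handled."""
    cmd = stmt['cmds'][0]
    if cmd['subtype']['name'] != 'AT_AlterColumnType':
        raise ValueError('only ALTER COLUMN ... TYPE is handled here')
    return {
        'alter_column': {
            'table': stmt['relation']['relname'],          # target table
            'column': cmd['name'],                         # target column
            'type': cmd['def_']['typeName']['names'][-1]['sval'],  # new type
        }
    }

# Trimmed version of the pglast output shown above.
parsed = {
    '@': 'AlterTableStmt',
    'cmds': ({'@': 'AlterTableCmd',
              'def_': {'typeName': {'names': ({'sval': 'text'},)}},
              'name': 'c5',
              'subtype': {'name': 'AT_AlterColumnType'}},),
    'relation': {'relname': 't1'},
}
print(alter_type_to_pgroll(parsed))
# {'alter_column': {'table': 't1', 'column': 'c5', 'type': 'text'}}
```

A real converter would dispatch on subtype.name (and on the statement's '@' tag) to cover the other DDL operations the same way.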