fix: robustly strip psql meta commands
fix(compiler): robustly strip psql meta commands without breaking SQL
Replace naive line-based removal with a single-pass state machine that correctly distinguishes psql meta-commands from backslashes in SQL code, literals, and comments.
The previous implementation would incorrectly strip any line starting with a backslash, breaking valid SQL containing:
- Backslashes in string literals (e.g. `E'\\n'`, escape sequences)
- Meta-command text in comments or documentation
- Dollar-quoted function bodies with backslash content
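
To make the failure mode concrete, here is a minimal sketch of a line-based filter of the kind described above. `naiveStrip` is a hypothetical name and this is not the code that was in sqlc; it only assumes the old logic dropped every line whose first character is a backslash.

```go
package main

import "strings"

// naiveStrip is a hypothetical stand-in for a line-based filter: it drops
// every line that begins with a backslash, regardless of SQL context.
func naiveStrip(src string) string {
	lines := strings.Split(src, "\n")
	kept := make([]string, 0, len(lines))
	for _, line := range lines {
		if strings.HasPrefix(line, `\`) {
			continue // assumed to be a psql meta-command
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}
```

Given a dollar-quoted body whose second line starts with `\d`, a filter like this deletes that line and silently changes the contents of the string.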
Changes:
- Track parsing state for single quotes, dollar quotes, and block comments
- Only remove backslash commands at true line starts outside any literal context
- Properly handle escaped quotes (`''`) and nested block comments (`/* /* */ */`)
- Support dollar-quoted tags with identifiers (e.g. `$tag$...$tag$`)
- Add comprehensive test suite covering:
  - All documented psql meta-commands (`\connect`, `\set`, `\d*`, etc.); see the PostgreSQL psql docs
  - String literals with backslashes and nested quotes
  - Dollar-quoted blocks with various tag formats
  - Nested block comments containing meta-command text
  - Edge cases: empty input, whitespace-only, missing newlines
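
The state tracking listed under Changes could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: `stripMetaCommands` and `dollarDelimiter` are hypothetical names, a meta-command is assumed to start at column 0, dollar-quote tags are limited to ASCII identifiers, and `--` line comments and backslash escapes inside `E'...'` strings are not handled.

```go
package main

import "strings"

// stripMetaCommands is a hypothetical sketch (not sqlc's implementation).
// It drops lines whose first byte is a backslash, but only when that line
// start lies outside '...' strings, $tag$...$tag$ bodies and /* */ comments.
func stripMetaCommands(src string) string {
	var out strings.Builder
	out.Grow(len(src)) // the output is never longer than the input

	inQuote := false  // inside a single-quoted string
	dollarTag := ""   // full delimiter (e.g. "$tag$") while inside a dollar quote
	commentDepth := 0 // nesting depth of block comments

	for i := 0; i < len(src); {
		c, rest := src[i], src[i:]
		switch {
		case dollarTag != "":
			if strings.HasPrefix(rest, dollarTag) { // closing delimiter
				out.WriteString(dollarTag)
				i += len(dollarTag)
				dollarTag = ""
				continue
			}
		case inQuote:
			if strings.HasPrefix(rest, "''") { // escaped quote stays in the string
				out.WriteString("''")
				i += 2
				continue
			}
			if c == '\'' {
				inQuote = false
			}
		case commentDepth > 0:
			if strings.HasPrefix(rest, "/*") || strings.HasPrefix(rest, "*/") {
				if c == '/' {
					commentDepth++ // nested opener
				} else {
					commentDepth-- // closer
				}
				out.WriteString(rest[:2])
				i += 2
				continue
			}
		case c == '\\' && (i == 0 || src[i-1] == '\n'):
			// Meta-command at a true line start: skip through the newline.
			if nl := strings.IndexByte(rest, '\n'); nl >= 0 {
				i += nl + 1
			} else {
				i = len(src)
			}
			continue
		case c == '\'':
			inQuote = true
		case strings.HasPrefix(rest, "/*"):
			commentDepth = 1
			out.WriteString("/*")
			i += 2
			continue
		case c == '$':
			if tag := dollarDelimiter(rest); tag != "" {
				dollarTag = tag
				out.WriteString(tag)
				i += len(tag)
				continue
			}
		}
		out.WriteByte(c)
		i++
	}
	return out.String()
}

// dollarDelimiter returns the opening delimiter ("$$" or "$tag$") at the
// start of s, or "" if s does not begin with one (for example "$1").
func dollarDelimiter(s string) string {
	for j := 1; j < len(s); j++ {
		ch := s[j]
		switch {
		case ch == '$':
			return s[:j+1]
		case ch == '_' || (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z'):
			// valid tag character at any position
		case ch >= '0' && ch <= '9' && j > 1:
			// digits are valid, but not as the first tag character
		default:
			return ""
		}
	}
	return ""
}
```

Because the three literal states are checked before the backslash case, a backslash at the start of a line inside a string, dollar-quoted body, or block comment is copied through unchanged.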
Performance improvements:
- Pre-allocate output buffer with `strings.Builder.Grow()`
- Single pass eliminates redundant string operations
- Reduces allocations by avoiding intermediate line slices
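
As a rough illustration of the allocation point, the sketch below filters lines by slicing the input in place and writing into a pre-sized `strings.Builder`, so no `[]string` of lines is ever built. `keepLines` and its `keep` callback are hypothetical names chosen for this example, not part of the PR.

```go
package main

import "strings"

// keepLines is a hypothetical helper: it copies the lines of src for which
// keep returns true into a pre-sized builder, without building a []string.
func keepLines(src string, keep func(line string) bool) string {
	var out strings.Builder
	out.Grow(len(src)) // one up-front allocation; output never exceeds input
	for len(src) > 0 {
		end := strings.IndexByte(src, '\n') + 1 // include the newline
		if end == 0 {
			end = len(src) // last line has no trailing newline
		}
		line := src[:end]
		if keep(strings.TrimSuffix(line, "\n")) {
			out.WriteString(line)
		}
		src = src[end:]
	}
	return out.String()
}
```

Each `line` is a substring that shares memory with the input, so the only buffer that grows is the builder's.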
Testing:
- `go test ./internal/compiler`
- 100% test coverage of new function `removePsqlMetaCommands()`
Addresses gbarr's comment in https://github.com/sqlc-dev/sqlc/pull/4082 which closes https://github.com/sqlc-dev/sqlc/issues/4065
@andrewmbenton please review
Thanks @ignat980, this looks like a much more complete solution than I was expecting.
@ignat980 if you open this against main I can get this merged. Just make sure to include Andrew's commits.
@kyleconroy Thanks! I rebased onto the latest sqlc/main and changed this PR's base branch to sqlc/main. Just waiting on the CI tests to finish.