`DO ...` statement is normalized as an unparsable string
Hi,
We need to parse normalized queries but we have an issue when we .normalize a DO ... statement since 2.2.0:
irb(main):001:0> normalized_sql = PgQuery.normalize("DO $$BEGIN RAISE NOTICE 'hello'; END$$;")
=> "DO $1;"
irb(main):002:0> PgQuery.parse(normalized_sql)
/Users/flora.briand/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/pg_query-2.2.0/lib/pg_query/parse.rb:3:in `parse_protobuf': syntax error at or near "$1" (scan.l:1236) (PgQuery::ParseError)
from /Users/flora.briand/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/pg_query-2.2.0/lib/pg_query/parse.rb:3:in `parse'
from (irb):4:in `<main>'
from /Users/flora.briand/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/irb-1.4.1/exe/irb:11:in `<top (required)>'
from /Users/flora.briand/.rbenv/versions/3.1.2/bin/irb:25:in `load'
from /Users/flora.briand/.rbenv/versions/3.1.2/bin/irb:25:in `<main>'
We did not have the issue in pg_query 2.1.4:
irb(main):001:0> normalized_sql = PgQuery.normalize("DO $$BEGIN RAISE NOTICE 'hello'; END$$;")
=> "DO $$BEGIN RAISE NOTICE 'hello'; END$$;"
irb(main):002:0> PgQuery.parse(normalized_sql)
=> #<PgQuery::ParserResult:0x0000000112e57b90
@aliases=nil,
@cte_names=nil,
@functions=nil,
@query="DO $$BEGIN RAISE NOTICE 'hello'; END$$;",
@tables=nil,
@tree=
<PgQuery::ParseResult: version: 130003, stmts: [<PgQuery::RawStmt: stmt: <PgQuery::Node: do_stmt: <PgQuery::DoStmt: args: [<PgQuery::Node: def_elem: <PgQuery::DefElem: defnamespace: "", defname: "as", arg: <PgQuery::Node: string: <PgQuery::String: str: "BEGIN RAISE NOTICE 'hello'; END">>, defaction: :DEFELEM_UNSPEC, location: 3>>]>>, stmt_location: 0, stmt_len: 38>]>,
@warnings=[]>
@fbriand-docto Thanks for reaching out!
This was an intentional change in 2.2.0 to avoid data privacy-related issues with generated DO statements that contain personally identifiable information, such as produced by Sequelize (the Node.js ORM).
However I can see how the fact that this no longer parses is not ideal. I'll review how we can adjust the parser to support this case (we already have a parser patch for a few other cases).
Out of curiosity, what is your use case for pg_query, and do you typically work with DO statements a lot?
@lfittl Thanks for your answer.
We use pg_query to analyze the queries, so for example to know if the query is a select or a write.
AFAIK we have 2 usages of the DO statement which are both used in the initialization of our tests. One to setup a read-only user and the other one to empty all tables.