libpg_query
libpg_query copied to clipboard
A_Const parse output not ideal
See here the following A_Const node output: (all cases adhere to this pattern)
{"A_Const":{"ival":{"ival":1},"location":7}}
Ideal output
I believe a more ideal output would be something like:
{"A_Const":{"val":{"Integer":{"ival":1}},"location":7}}
This would match the protobuf definition, I think?
https://github.com/pganalyze/libpg_query/blob/680f5ee67c6fdae497c8d1edfadd02b9b8eac74f/protobuf/pg_query.proto#L314-L325
Note that this appears to have been the output back in April 2021, as evidenced by this issue. I haven't tried to pinpoint where/when this behavior changed.
Root cause
I'm not familiar with the codebase, but this appears to be where the problem occurs: https://github.com/pganalyze/libpg_query/blob/680f5ee67c6fdae497c8d1edfadd02b9b8eac74f/src/pg_query_outfuncs_protobuf.c#L192-L198
Note that it's not just a problem with integers, but with all value types supported by A_Const.
Yeah, I agree it would be more logical to have this emitted in JSON with an explicit node like "Integer", instead of just the value.
I tried to do a related modification that avoids the ival duplication in #246, but realized that the problem there is that we currently have validations that the JSON output of libpg_query matches what a Protobuf library emits as JSON representation of the Protobuf (specifically the protobuf-cpp library, since that's an official one published as part of the main set of libraries that has JSON output support).
We could of course say that's a bad idea, and make the JSON output make more sense of its own, and drop the protobuf-cpp validation step.
Is the current format causing specific issues for parsing on your end, or was this more out of curiosity?
For my Node.js binding to libpg_query, I generate TypeScript definitions with the aid of libpg_query's srcdata files, which of course don't necessarily match the JSON output. The type mismatch led to a runtime error that revealed this idiosyncrasy to me.
Thanks, makes sense!
I do think we could reconsider what to do here, but will probably leave as-is for now since its not an urgent issue (rather an oddity).
There is also a future consideration on how a potential parser library in Postgres upstream could look like, and I think the JSON format would certainly be revisited as part of any such effort (there is an argument that JSON is a better fit for an upstream implementation than Protobuf).