[feature request] Getting the parse tree of a PlannedStmt

Open reteps opened this issue 7 months ago • 4 comments

It would be excellent if there was some way for libpg_query to support custom operators.

For example, providing some way to update the 'internal lexer', by passing in SQL expressions that modify the state of the parser.

https://github.com/pgvector/pgvector/blob/7b583523365f4b2ea7ec7ea2feab6836267c046e/sql/vector--0.4.4--0.5.0.sql#L29-L32

CREATE OPERATOR CLASS vector_l2_ops
	FOR TYPE vector USING hnsw AS
	OPERATOR 1 <-> (vector, vector) FOR ORDER BY float_ops,
	FUNCTION 1 vector_l2_squared_distance(vector, vector);

Then, at a later point, the <-> operator is recognized. I'm not entirely sure how Postgres itself handles this.

reteps avatar May 09 '25 20:05 reteps

Thanks for reaching out!

As I understand it, adding custom operators doesn't modify the parser itself, only the parse analysis step that operates on the raw parse output. Since libpg_query stops before that step (the output it returns is the raw parse tree), no modifications should be necessary.
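To make the split between raw parsing and parse analysis concrete, here is a minimal toy sketch in Python. It is not Postgres or libpg_query code: the classes `RawOpExpr` and `ResolvedOpExpr`, the function `analyze`, and the `CATALOG` dictionary are all hypothetical names invented for illustration, and the catalog entries only loosely mimic what a `CREATE OPERATOR` statement would register. The point it tries to show is that a raw node carries just the operator name, and that a catalog lookup is needed only in the later step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawOpExpr:
    """Toy stand-in for a raw parse node: only the operator name is known."""
    name: str


@dataclass(frozen=True)
class ResolvedOpExpr:
    """Toy stand-in for an analyzed node: the name is bound to a function."""
    name: str
    function: str
    result_type: str


# Hypothetical operator catalog keyed by (name, left type, right type).
# The function names are illustrative, not taken from a real catalog.
CATALOG = {
    ("<=", "int4", "int4"): ("int4le", "bool"),
    ("<->", "vector", "vector"): ("l2_distance", "float8"),
}


def analyze(node, left_type, right_type, catalog):
    """Resolve a raw operator node against a catalog (toy parse analysis)."""
    key = (node.name, left_type, right_type)
    if key not in catalog:
        raise LookupError(
            f"operator does not exist: {left_type} {node.name} {right_type}"
        )
    function, result_type = catalog[key]
    return ResolvedOpExpr(node.name, function, result_type)


# Building the raw node needs no catalog at all, so an unknown operator
# name is not an error at this stage.
raw = RawOpExpr("<->")

# Resolution needs both the operand types and the catalog.
resolved = analyze(raw, "vector", "vector", CATALOG)
print(resolved)
```

In this model, calling `analyze` with an empty catalog raises `LookupError`, while constructing `RawOpExpr("<->")` always succeeds. That mirrors the claim above under the stated assumptions: a library that stops at the raw tree does not need to know which operators exist.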

Do you have an example of a query that's not parsing as expected?

lfittl avatar May 09 '25 20:05 lfittl

Ah -- my apologies! I had assumed that it would parse the operators beyond

                  "A_Expr": {
                    "kind": "AEXPR_OP",
                    "name": [
                      {
                        "String": {
                          "sval": "<->"
                        }
                      }
                    ],

and

                  "A_Expr": {
                    "kind": "AEXPR_OP",
                    "name": [
                      {
                        "String": {
                          "sval": "<="
                        }
                      }
                    ],

I just compiled a test program and it worked as intended.

Would you be able to provide some insight into how Postgres uses this parse tree to do the actual operator handling? Does it transform it into another tree with explicit operator nodes like LESSTHAN_EQ_OP?

In any case, feel free to close this; it was a misunderstanding on my part.

reteps avatar May 09 '25 20:05 reteps

The explicit feature ask would then be the tree at Plan time, i.e. the PlannedStmt node representation.

reteps avatar May 09 '25 21:05 reteps

> The explicit feature ask would then be the tree at Plan time, i.e. the PlannedStmt node representation.

Makes sense!

That is not something we are planning to add to libpg_query, since it would involve significantly more code than what the library does today, which is focused on parsing, not planning.

For transparency: we have a variant of this inside the pganalyze app (see our post on that from a few years ago). It is not currently open source, both because of the maintenance burden of making it more generally useful and for business/competitive reasons, and it's a very different story in terms of complexity. To make planning work, you also need to provide the schema (not just the query) so that the query can be interpreted correctly.

lfittl avatar Jun 26 '25 19:06 lfittl