snuba icon indicating copy to clipboard operation
snuba copied to clipboard

draft(MQL): Parse multi-entity type formulas

Open enochtangg opened this issue 1 year ago • 4 comments

This is a draft PR for implementing what parsing multi-entity type formulas might look like in snuba.

Depends on:

  • https://github.com/getsentry/snuba-sdk/pull/169

This change breaks up the parsing pipeline into multiple sub-tasks:

  1. parse_mql_query_body(): Parses the MQL into the InitialParseResult tree
  2. populate_query_ast_initial(): Creates LogicalQuery | CompositeQuery objects with the appropriate expressions. If multi-type formula query, calls some processing function (process_table_names_in_selected_expressions()) which traverses AST and adds the appropriate join table names to the expressions.
  3. populate_query_from_mql_context Parses MQL Context and transforms fields into expressions that inject into the AST. If multi-type formula query, patch the appropriate join table names.

To see preview of CompositeQuery AST (with joins), run pytest tests/query/parser/test_formula_mql_query.py::test_multi_type_formula -s -vv

enochtangg avatar Feb 02 '24 22:02 enochtangg

How does resolving fit into this flow?

evanh avatar Feb 05 '24 21:02 evanh

How does resolving fit into this flow?

I'm wondering if resolving can function as-is, since processing a multi-type formula query only involves updating the table_name on certain expressions (e.g. Column). From what I can tell, the resolution step is only concerned with updating the names and values of those expressions (e.g. column name) and can operate separately?

enochtangg avatar Feb 06 '24 02:02 enochtangg

Reading over this, I think the refactor proposed is generally good. However, there is no requirement to keep using -If for single type formulas. And in fact, we can't support time spent percentage if we keep doing that anyways.

Given that's the case, we can and probably should rewrite the how formulas get parsed. There's no need to parse the initial SelectedExpressions with -If for example, and that simplifies some of the JOIN parsing code.

evanh avatar Feb 06 '24 21:02 evanh

There's no need to parse the initial SelectedExpressions with -If for example, and that simplifies some of the JOIN parsing code.

I agree. This was something I was also contemplating, it would be nice if we just had one mechanism (JOINs) for all formulas. One thing we should assess is the performance of these join queries in production. While the -If queries are not performant to begin with, are we sure that JOINs won't be significantly worse before we make this refactor?

enochtangg avatar Feb 07 '24 17:02 enochtangg