graphql-engine icon indicating copy to clipboard operation
graphql-engine copied to clipboard

BIGQUERY: invalid sql generated when querying a table computed field

Open BenoitRanque opened this issue 1 year ago • 1 comments

Version Information

Server Version: 2.6.0

Environment

EE

What is the current behaviour?

when we have 3 nested nodes with the middle node being a table computed field relationship, the SQL query generated by Hasura is not valid.

In the following example, we have a relationship of unifiedloanuniverse_all to parcel_boundary using a table computed field and the relationship from parcel_boundary to tax_assessor using an object relationship, with that, when we run the query we get the following error:

Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN.

See the graphql query and the SQL query below:

query getLoanUniverseData {
  unifiedloanuniverse_all(limit: 10) {
    zip
    latitude
    longitude
    parcel_boundary__geom(args: {}) {
      tax_assessor__tax_assessor_id {
        one_line_address
        tax_assessor_id
      }
    }
  }
}
SELECT
  `t_unifiedloanunivers1`.`zip` AS `zip`,
  `t_unifiedloanunivers1`.`latitude` AS `latitude`,
  `t_unifiedloanunivers1`.`longitude` AS `longitude`,
  ARRAY( (
    SELECT
      AS STRUCT `get_unifiedloanunive1`.`tax_assessor_id` AS `tax_assessor_id`,
      STRUCT( IFNULL(`or_tax_assessor__tax1`.`one_line_address`, NULL) AS `one_line_address`,
        IFNULL(`or_tax_assessor__tax1`.`tax_assessor_id`, NULL) AS `tax_assessor_id` ) AS `tax_assessor__tax_assessor_id`
    FROM
      UNNEST( ARRAY( (
          SELECT
            AS STRUCT *
          FROM
            `display_220829090040`.`get_unifiedloanuniverse_all__to__parcel_boundary__geom`( `p_cherre_geo_point` => `t_unifiedloanunivers1`.`cherre_geo_point` ) AS `get_unifiedloanunive1` ) ) ) AS `get_unifiedloanunive1`
    LEFT OUTER JOIN (
      SELECT
        `t_tax_assessor1`.`tax_assessor_id` AS `tax_assessor_id`,
        `t_tax_assessor1`.`one_line_address` AS `one_line_address`
      FROM
        `display_220829090040`.`tax_assessor` AS `t_tax_assessor1` ) AS `or_tax_assessor__tax1`
    ON
      ( `or_tax_assessor__tax1`.`tax_assessor_id` = `get_unifiedloanunive1`.`tax_assessor_id` )
    LIMIT
      10 ) ) AS `parcel_boundary__geom`
FROM
  `display_220829090040`.`unifiedloanuniverse_all` AS `t_unifiedloanunivers1`
LIMIT
  10

Obs: I formatted the sql query to use less space

Executing the same type of nesting relationships but with array or object instead of table computed field works fine.

What is the expected behaviour?

Hasura should generate valid SQL

BenoitRanque avatar Sep 01 '22 18:09 BenoitRanque

Ref conversation

dsandip avatar Sep 13 '22 00:09 dsandip

Am able to confirm this issue persists as of hasura 2.12.0

BenoitRanque avatar Sep 28 '22 18:09 BenoitRanque

UPDATE: we have a minimal repro, so you can ignore my original message below.

Hey Benoit, thanks for this bug report! We're looking into this a priority, and working on a minimal repro, but the following would help us along faster:

  1. Can I confirm my understanding that parcel_boundary__geom is the computed field in question. If so, please could you share its definition?
  2. The DDL statements to set up this example
  3. A copy of the metadata

In this issue or by email/internal comms is fine. Thank you!

sassela avatar Sep 30 '22 09:09 sassela

@sassela Apologies, only seeing this now. You should be able to repro this issue with any query of the form

query {
  table1 {
    computed_table_field {
      table2 {
        scalar_field
      } 
    }
  }
}

Where table1 is a table, computed_table_field is a computed table field on table1, and table2 is another table accessed by navigating a relationship from the table returned by computed_table_field

I was able to reproduce this with a very minimal schema.

BenoitRanque avatar Oct 03 '22 11:10 BenoitRanque

Yep we got it, thanks @BenoitRanque ! We were able to repro, test and find the root issue (internal team context). I'm confirming with the product eng team how to proceed and will update here when we have a definitive answer.

sassela avatar Oct 03 '22 14:10 sassela

Hey Benoit, latest update. We've:

  • Isolated the error to the LEFT (OUTER) JOIN fragment in the above generated SQL.
  • Identified the root cause of the “correlated subqueries” error: the combination of LEFT JOIN and a query on computed fields which returns something from a given relationship.
  • Tried out a few equivalent SQL queries, but these fail for the same reason as long as these conditions hold.
  • Confirmed equivalent queries valid in at least Postgres. To the best of our knowledge, this error is unique to BigQuery and their subquery evaluation rules for performance reasons.
  • Decided not to try to workaround this error right now; to the best of our knowledge, doing so would violate the native DB semantics and what looks to be an intentional design decision.

I'll close this issue and create an internal one for our team to improve UX and/or error messaging around these errors. If you have any suggestions about UX improvements, I'd welcome them. In any case, thanks again for this bug report.

FYI @ajohnson1200 @codedmart

sassela avatar Oct 05 '22 17:10 sassela