prql icon indicating copy to clipboard operation
prql copied to clipboard

Snowflake Semi Structured Data incompatibility

Open sercandogan opened this issue 1 year ago • 1 comments

Hey 🙌🏽,

Snowflake supports semi-structured data as variant type and you can access the item by using colon and dot. Please see: https://docs.snowflake.com/en/user-guide/querying-semistructured.html#traversing-semi-structured-data

Example below fails with unexpected argument named:

from sometable
filter json:first_key.second_key == 1 and json:third_key == 50

I see also : used for named args & parameters. How can we support this syntax?

sercandogan avatar Sep 13 '22 20:09 sercandogan

Thanks for the issue @sercandogan

Generally we'd use an s-string for features that PRQL doesn't yet support, like:

from sometable
filter s"json:first_key.second_key" == 1 and s"json:third_key" == 50

Unfortunately this specific case is more difficult — because : is non-standard in SQL, and PRQL runs a formatter over the SQL, we get:

SELECT
  sometable.*
FROM
  sometable
WHERE
  json :first_key.second_key = 1
  AND json :third_key = 50

Note the space in json :first_key. Am I correct that snowflake doesn't like the space? Or is it OK?

We've had this issue a couple of times around s-strings. Ideally we could not format the SQL inside s-strings, but we'd need to think of how to do this — maybe we pass a string with a sentinel and then replace it after the formatting. (We do want to do some formatting, because otherwise it's a very long line, and producing interpretable SQL is a core principle)

max-sixty avatar Sep 13 '22 20:09 max-sixty

Non-ideal but we used to have a similar issue and we worked around it by just always running a regex replacement to remove these spaces automatically straight after compilation to SQL: s/json :/json:/g

mklopets avatar Sep 26 '22 11:09 mklopets

I opened #1284 to address the broader issue around formatting, which subsumes this issue, so I'll close this.

max-sixty avatar Dec 17 '22 19:12 max-sixty