Add support for Snowflake Language
This is first draft to solve #366 . It does not work yet, mainly because I had some troubles with the QuoteTypes.
Review
Please review, have a look at the following things:
TODOcomments show lines where I'm totally lost- I'm very unsure about the
reservedComments, thefunctionsand thekeywords. - Did I put the correct things in there, or do they belong somewhere else?
- Please check the places where I used the
[thing] {a|b|c} ...syntax, if I used them correctly. - Did I miss any tests?
- Any other places that have to be changed?
Questions
I want to support accessing elements in unstructured data.
This can be done via square brackets [,] or via colons : , like here:
select src:salesperson.name from car_sales
select src['salesperson']['name'] from car_sales
How should I do this? I added : as an operator, is this right? How about the brackets?
Thanks for this Repo!
Thanks for the effort.
Regarding your main question about colon and square-bracket syntax.
- For brackets, adding
extraParens: ['[]']should do the trick. See for example inn1ql.formatter.ts - With colon, things are a bit trickier. You should indeed add it as an operator. However, currently the default is to format colon with a space after (as in
name: value). To fix this, a language-specific condition should be added toExpressionFormatter.formatOperator()- there already is a similar thing for@operator in PL/SQL.
In general
It would be really helpful if you could also update our wiki with information about Snowflake SQL syntax. Otherwise I'll have to do all this research on my own before I can merge this PR.
I will do all the stuff you mentioned.
Please go other the TODO comments in the changes. I can't run tests because I don't understand the Quote types yet...
Regarding quote types, see: https://github.com/sql-formatter-org/sql-formatter/blob/master/src/lexer/regexFactory.ts#L99
The -qq and -bs suffixes at the end of quote types define what sort of escaping (if any) is used inside the quotes.
Some comments about wiki updates:
-
Parameters page says that Snowflake supports
$1parameters and links to select docs. I'm pretty sure these are something else than parameters one would use in prepared statements. I think these things are more similar to variables - they also share a very similar syntax. It's probably better to reference this syntax from Variables page. Probably the simplest path for supporting this syntax is also to treat them the same as variables - at lest that would be my first gut-instinct-guess. -
Comments page mentions support for
# commentsbut says these aren't actually supported. Syntax highlighting really shouldn't be used as a guide for what is and isn't supported. So probably best to be removed. Too bad you haven't found any official documentation for the comment syntax. Found one Knowledge Base page that at least mentions the//syntax - we could link to that.
I updated all the wiki pages for snowflake (also updated your suggestions).
One problem is the Window-Clause. Snowflake seems to not have support for a WINDOW clause in it's Query Syntax. There is QUALIFY but I'm unsure if this is the same?
However, there is support for Window Functions but also here I'm unsure if it is the same as in the other languages.
Could you have a look into this?
I will rewriting the code now.
Thanks. Updated the docs for Snowflake window functions syntax.
BTW. I would appreciate if you didn't do further force-pushes to this branch. They make it hard to tell what has changed.
First refactor done.
Please the snowflake.test shows an infinity loop. Could you have a look into it? It happens even with a single test case.
Regarding the formatting of : operator. I made a change to make the customization of formatting simpler: #466
One should now be able to simply add this to SnowflakeFormatter class:
formatOptions() {
return { alwaysDenseOperators: [':'] };
}
Also added escaping of keywords, so the $ character in function names should no more pose a problem: #473
I'm sorry, I screwed up the commits by including your commits into this PR. This is why I usually force push, it allows me to cleanly integrate commits to master with rebase to the feature branch and then force push the feature branch.
Merging would've been a better idea.
I will clean up the tests now.
Ok, I reverted your commits in this PR and instead merged master into this branch. Everything fine now. Puhhh...
The only new important commit for you is 8dc6f51
If you want, feel free to delete these accidental commits and force-push the result. You'll just need to:
git rebase -i 152bd48
# delete all the commits besides the 8dc6f51 and the last merge
# or also delete the merge and just redo it.
-
Is there an option to define how identifiers are allowed to look like? In Snowflake they can include any symbol when inside
"", but without quoting they have a certain rule. There is theidentCharsConfig option, but I'm unsure if this refers to quoted or unquoted identifiers. -
Is there an option to test variables?
-
Did I miss any obvious
support...tests?
Is there an option to define how identifiers are allowed to look like?
Yep, there are two options for that:
identTypesdescribes quoted identifiers (with all characters allowed inside quotes).identCharsdescribes unquoted identifiers.
Both of these seem to be configured correctly.
Is there an option to test variables?
No, the variable syntax differs quite a bit between SQL dialects. Have a look at the variable tests in behavesLikeMariaDbFormatter.ts and create some custom tests for Snowflake.
Did I miss any obvious support... tests?
Doesn't seem so at first glance.
I understand your concerns regarding the tests. However in my opinion they have the purpose of explaining the accepted syntax without having to look into the actual source files.
For example with supportsTruncateTable(format, { withoutTable: true, ifExists: true }) we can very easily see all the dialects that support TRUNCATE IF EXISTS, without having to manually look into the source files?
We've 3 options:
- Leave the tests in this PR, add the other languages that support them later (I can do this if you want?)
- Add the tests completly in a later PR
- Remove the tests
Option 3. would decrease test coverage and verbosity for the sake of a few less lines and some milisecons of test time.
You are the maintainer, it's up to you.
OK, let the tests be for now as they are. My main concern was with the SET SCHEMA test. Others are not that big of a deal.
OK. I fixed the : handling. If you now merge in the master, it should work.
Though now you'll also need to add the :: operator to the alwaysDenseOperators config.
Everything done, all tests pass. Is this mergable now? :smiley:
I somehow can't answer your comments, so I'll do it here:
I added to reservedPhrases:
'MATCH {FULL | SIMPLE | PARTIAL}',
'ON {UPDATE | DELETE} {CASCADE | SET NULL | SET DEFAULT | RESTRICT | NO ACTION}',
'INITIALLY {DEFERRED | IMMEDIATE}',
This breaks the supportConstraints tests, because we are doing ON UPDATE CURRENT_TIMESTAMP there.
This test is problematic anyway because we are using it for many languages that don`t even support this line (Postgresql for example).
I fixed the test, we can add the correct specifications for other languages later.
Thanks for putting up with all my complaints :)
I'll merge this in and create a beta release out of it.
Thank you for all the fast feedback. It was my first open source contribution and you made the process very easy.
It's now released as 11.0.0-beta.1