Invalid Keyword Usage in `PostgreSQL` Dialect
Search before asking
- [X] I searched the issues and found no similar issues.
What Happened
According to the postgresql keywords appendix, some keywords can't be used as function names or types.
However, when running the following command, no error was raised:
test.sql
CREATE TABLE test_table (
type between NOT NULL
);
command
$ sqlfluff parse test.sql --dialect postgres
[L: 1, P: 1] |file:
[L: 1, P: 1] | statement:
[L: 1, P: 1] | create_table_statement:
[L: 1, P: 1] | keyword: 'CREATE'
[L: 1, P: 7] | whitespace: ' '
[L: 1, P: 8] | keyword: 'TABLE'
[L: 1, P: 13] | whitespace: ' '
[L: 1, P: 14] | table_reference:
[L: 1, P: 14] | naked_identifier: 'test_table'
[L: 1, P: 24] | whitespace: ' '
[L: 1, P: 25] | bracketed:
[L: 1, P: 25] | start_bracket: '('
[L: 1, P: 26] | [META] indent:
[L: 1, P: 26] | newline: '\n'
[L: 2, P: 1] | whitespace: ' '
[L: 2, P: 5] | column_reference:
[L: 2, P: 5] | naked_identifier: 'type'
[L: 2, P: 9] | whitespace: ' '
[L: 2, P: 10] | data_type:
[L: 2, P: 10] | data_type_identifier: 'between'
[L: 2, P: 17] | whitespace: ' '
[L: 2, P: 18] | column_constraint_segment:
[L: 2, P: 18] | keyword: 'NOT'
[L: 2, P: 21] | whitespace: ' '
[L: 2, P: 22] | keyword: 'NULL'
[L: 2, P: 26] | newline: '\n'
[L: 3, P: 1] | [META] dedent:
[L: 3, P: 1] | end_bracket: ')'
[L: 3, P: 2] | statement_terminator: ';'
[L: 3, P: 3] | [META] end_of_file:
Expected Behaviour
An unparsable error should be raised
Observed Behaviour
No error was raised
How to reproduce
test.sql
CREATE TABLE test_table (
type between NOT NULL
);
command
$ sqlfluff parse test.sql --dialect postgres
Dialect
postgres
Version
3.2.5
Configuration
None
Are you willing to work on and submit a PR to address the issue?
- [X] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
@alanmcruickshank I'm working on dialect_postgres_keywords and noticed that its structure is a bit different from other dialects. This difference may be due to the PostgreSQL extension keywords.
In addition, I'm considering if not-keyword terms are necessary in postgres_keywords, since they may be keywords in ANSI but not in PostgreSQL. I think it might be better to define postgres_reserved_keywords and postgres_non_reserved_keywords directly, rather than using tuples, which are only used in postgres_keywords and not in other dialects.
For extension keywords, I thought of adding separate lists like pgvector_keywords or postgis_keywords, as these are all non-reserved.
Could you provide any advice on this approach?
Hey @R3gardless . I wrote that keyword set initially. The idea was that I could track what was in the docs, and override what the docs said based on what was actually true in the language. In retrospect, it wasn't a great idea, and it just causes the keywords to look different in postgres to every other dialect. Feel free to rewrite it if you want to 👍
Thank you for the advice, @WittierDinosaur!
I referred to the official PostgreSQL source code (gram.y) and noticed that keywords are divided into five categories: unreserved, col_name, type_func_name, reserved, and bare_label.
I’m considering following the official parser rules to align with the source code. This approach could necessitate some modifications to the overall postgres_dialect implementation.
I’d appreciate your thoughts on this idea.
Thanks again!