node-sql-parser icon indicating copy to clipboard operation
node-sql-parser copied to clipboard

Preserve the original identifier quoting

Open lhpz opened this issue 2 years ago • 1 comments

Describe the bug Preserve the quoting (or absence of quoting) of identifiers in the original query

Database Engine PostgreSQL.

Context In Postgres, unquoted identifiers are case insensitive. So SELECT mYcoL FROM tEsT and SELECT mycol FROM test are equivalent queries. However, quoted identifiers are case sensitive. So SELECT "MyCol" FROM test is semantically different from the two previous queries.

Since the sqlify() function will always quote identifiers, it will produce SQL that is semantically different from the original, when the original used unquoted identifiers.

To Reproduce

const p = new Parser();
const ast = p.astify(`SELECT MyCol FROM "test"`, { database: 'Postgresql' });
const sql = p.sqlify(ast, { database: 'Postgresql' });
console.log(sql);

node-sql-parser version 4.11.0

Observed output SELECT "MyCol" FROM "test". "MyCol" is rebuilt with quotes, when the original SQL was unquoted.

Expected output SELECT MyCol FROM "test", the original quoting style is preserved for the two identifiers MyCol (unquoted) and "test" (quoted).

The solution is to track if the original identifier was quoted or not in the AST, and rebuild the SQL based on this additional information. Perhaps an additional quoted property could be added to the AST for a more detailed description of identifiers:

{
    "type": "column_ref",
    "table": null,
    "column": { 
         "value": "MyCol", 
         "quoted": true 
    }
}

Of course this AST modification will break all the code that expects a string value in the column key. It should probably be considered only in a major release, or sooner with a different AST change that will be backward compatible.

lhpz avatar Oct 03 '23 17:10 lhpz

Hello, the issue still appears in big query.

Can be reproduced this way

 it.only('should preserve bg quotes', () => {
    const parser = new Parser();
    const query = 'SELECT `a` FROM `b`';
    const parsed = parser.astify(query, { database: 'bigquery' });
    const sql = parser.sqlify(parsed, { database: 'bigquery' });
    expect(sql).toEqual(query);
  });

Removing the quotes can break the queries due to either the case sensitiveness, or the fact that a quoted identifier may have dashes (while unquoted cannot). See more here https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical

quadristan avatar Dec 10 '24 16:12 quadristan