sql-formatter
Support Chinese parenthesis characters in MySQL
Input data
SELECT `时间` as "时间",SUM(进度(计划完成率)) as "SUM(`进度(计划完成率)`)" FROM ds_upload_19 WHERE 1=1 GROUP BY `时间` LIMIT 1000
Expected Output
SELECT
`时间` as "时间",
SUM(进 度 ( 计 划 完 成 率 )) as "SUM(`进度(计划完成率)`)"
FROM
ds_upload_19
WHERE
1 = 1
GROUP BY
`时间`
LIMIT
1000
Actual Output
SELECT `时间` as "时间",SUM(进度(计划完成率)) as "SUM(`进度(计划完成率)`)" FROM ds_upload_19 WHERE 1=1 GROUP BY `时间` LIMIT 1000
Usage
- How are you calling / using the library?
- What SQL language(s) does this apply to?
- Which SQL Formatter version are you using?
Are you able to provide more context?
The formatter works as expected: -- starts a line comment in SQL.
This is also demonstrated by how GitHub syntax-highlights this code (grayed out as a comment).
Sorry, it's my fault. The correct SQL is SELECT `时间` as "时间",SUM(进度(计划完成率)) as "SUM(`进度(计划完成率)`)" FROM ds_upload_19 WHERE 1=1 GROUP BY `时间` LIMIT 1000, without --
The formatting result is correct when I use the ASCII "(" instead of the Chinese "(", so the problem may lie there.
Are you able to provide more context?
There is no more context.
So, as I understand it, the issue is with some sort of Unicode parenthesis character. I don't know what role this character plays in this language, how it should be treated in SQL, or how the SQL dialect you're using treats it.
To simplify diagnosing the problem, could you rewrite this problematic piece of SQL with the minimum number of non-ASCII characters?
Also, you haven't mentioned which dialect of SQL you are using. Is it MySQL, SQLite, or something else?
Simplified SQL: select str(str)from db
This is a type of MySQL without character restrictions, and the Chinese character "(" plays the same role as the character "(" in English.
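To make the distinction concrete, here is a small Python sketch (not part of sql-formatter) showing that the Chinese parentheses are separate Unicode code points from the ASCII ones. It assumes the characters in question are the fullwidth forms U+FF08 and U+FF09; the helper name `describe` is made up for illustration.

```python
import unicodedata

ASCII_OPEN = "("
FULLWIDTH_OPEN = "\uFF08"   # assumed to be the Chinese "(" from this issue
FULLWIDTH_CLOSE = "\uFF09"  # assumed to be the matching Chinese ")"


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(ASCII_OPEN))      # U+0028 LEFT PARENTHESIS
print(describe(FULLWIDTH_OPEN))  # U+FF08 FULLWIDTH LEFT PARENTHESIS

# NFKC normalization folds the fullwidth forms into ASCII, which changes the
# input text, so it is shown here only to illustrate the relationship.
simplified = f"select str{FULLWIDTH_OPEN}str{FULLWIDTH_CLOSE}from db"
print(unicodedata.normalize("NFKC", simplified))  # select str(str)from db
```

A tokenizer that only checks for U+0028 and U+0029 would see the fullwidth characters as ordinary non-ASCII text, which may be what happens here.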
Thanks for the explanation @jiangyayu.
I'll need to do some research into how this issue impacts (or doesn't impact) other dialects.
It definitely won't be a simple thing to fix.
A few additional questions, to make sure I get things right:
- If the formatter replaced all these Chinese "(" characters with plain ASCII "(", it probably wouldn't be acceptable, right?
- If one uses the Chinese open-paren character, is it mandatory to close it with the Chinese close-paren character as well, or can the Chinese/ASCII variants be used interchangeably?
For the first question, you are right. If the input contains Chinese "(" characters and the output turns them into "(", the input has been changed, so I think that's not acceptable.
In Chinese, the open-paren and close-paren characters should be used in pairs. The grammar is incorrect if only one of them is used.
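The two answers above can be sketched as a small pairing check. This is a hypothetical Python helper, not sql-formatter code, and it reads the second answer as meaning that a Chinese opener must be closed by a Chinese closer (again assuming U+FF08 and U+FF09). It leaves every character as it was typed and ignores string literals, quoted identifiers and comments for brevity.

```python
# Each opening parenthesis may only be closed by the same variant.
PAIRS = {"(": ")", "\uFF08": "\uFF09"}
CLOSERS = set(PAIRS.values())


def parens_balanced(sql: str) -> bool:
    """Return True if ASCII and fullwidth parentheses are balanced and unmixed."""
    expected = []  # stack of closers we still need to see
    for ch in sql:
        if ch in PAIRS:
            expected.append(PAIRS[ch])
        elif ch in CLOSERS:
            if not expected or expected.pop() != ch:
                return False
    return not expected


print(parens_balanced("select str(str) from db"))            # True
print(parens_balanced("select str\uFF08str\uFF09 from db"))  # True
print(parens_balanced("select str\uFF08str) from db"))       # False (mixed)
```

Whether MySQL itself enforces this pairing is not established in this thread, so treat the rule as an assumption to verify against the dialect.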