docs icon indicating copy to clipboard operation
docs copied to clipboard

Implement SQL formatter into the docs pipeline

Open qiancai opened this issue 3 years ago • 6 comments

Change Request

  1. Describe what you find is inappropriate or missing in the existing docs.

    It would be better to have SQL code be more readable by default and be stacked as opposed to all one line.

    For example, the line:

    SELECT l_orderkey, MAX(L_COMMENT), MAX(L_SHIPMODE), MAX(L_SHIPINSTRUCT), MAX(L_SHIPDATE), MAX(L_EXTENDEDPRICE) FROM lineitem GROUP BY l_orderkey HAVING SUM(l_quantity) > 314;
    

    Would be more readable as:

    SELECT 
          l_orderkey, 
          MAX(L_COMMENT), 
          MAX(L_SHIPMODE), 
          MAX(L_SHIPINSTRUCT), 
          MAX(L_SHIPDATE), 
          MAX(L_EXTENDEDPRICE) 
    FROM lineitem 
    GROUP BY l_orderkey 
    HAVING SUM(l_quantity) > 314;
    
  2. Describe your suggestion or addition.

    Try and put something like https://sqlfum.pt/ into the docs pipeline to automatically:

    • Fix the case for SQL keywords (e.g. "insert" → "INSERT")
    • Convert to multi-line
  3. Provide some reference materials (such as documents and websites) if you could.

    https://github.com/mjibson/sqlfmt (which is https://wasm.sqlfum.pt/ ) https://sqlfmt.com/ https://www.salvis.com/blog/2020/08/28/formatting-sql-code-blocks-in-markdown-files/

qiancai avatar Mar 27 '23 08:03 qiancai

@qiancai I found that some SQL fragments in markdown files are not well-organized, therefore prettifying is not possible for some formatter tools.

Here are some suggestions, taking this document for example:

  • If you intend to emphasis the structure of each individual SQL sentence, you may: Sepereate the execution results (the character table and "x rows in set", "Query OK, x rows affected") from the entire block Set the results as normal code block (not marked as ```sql).

  • If you intend to show the whole workflow of some SQL sentences, you may: Keep the entire block as ```sql, but some markers shall be made for non-SQL content so that parser tools can identify then ignore them.

P.S. I think this formatter tool looks sweet, it comes with a yarn package and is easy to integrate.

curiousRay avatar Jan 09 '24 16:01 curiousRay

Plus, the user might not want to copy mysql> together with SQL sentences. Is there any possibility that mysql> or MySQL [test]> is rendered purely by typescript, rather than from markdown sourcefile?

curiousRay avatar Jan 09 '24 16:01 curiousRay

Hi @curiousRay Many thanks for the investigation. Yes, as we have a large number of instances that include both SQL statements and their results in one code block, it is hard for the formatter tool to distinguish them.

As for "mysql>", we can directly remove it from the Markdown source file because the website renders the doc according to ```sql instead of "mysql>".

qiancai avatar Jan 10 '24 07:01 qiancai

/cc @dveeden

qiancai avatar Jan 10 '24 07:01 qiancai

@curiousRay @qiancai I've created a proof of concept SQL formatter for markdown based on TiDB: https://github.com/dveeden/tidb-markdown-sql-format

If we can remove the prompt and add it back when rendering that would be good.

I think we should aim to have something like this:

```sql
SELECT 1
```
```
+----------+
| output|
+----------+
| output |
+----------+
```

This has:

  1. No prompt in the SQL code block
  2. The output and statement in separate code blocks
  3. The output code block (resultset or something like Query OK, 0 rows affected (0.11 sec)) not being set to sql

However we currently have many places with:

  1. A prompt (mysql> ) included in the SQL
  2. The statement and resultset or output combined.
  3. SQL code blocks having only output or a resultset (Statement and Result in different code blocks, both with the type set to sql)

If we split the SQL statement and output code block we could even have a formatter that formats the tabular output in a nicely looking table if we want. However I'm not sure if we should do that at some point and I don't think we should do that now.

dveeden avatar Jan 19 '24 10:01 dveeden

P.S. I think this formatter tool looks sweet, it comes with a yarn package and is easy to integrate.

@curiousRay it now also has TiDB support: https://github.com/sql-formatter-org/sql-formatter/tree/master/src/languages/tidb

dveeden avatar Apr 12 '24 07:04 dveeden