sql_formatter icon indicating copy to clipboard operation
sql_formatter copied to clipboard

Formatting of CTEs

Open sambrilleman opened this issue 4 years ago • 4 comments
trafficstars

Is your feature request related to a problem? Please describe

Thanks for this package! Is it possible to have the start of a CTE definition on a new line?

The current behaviour is that a CTE is a continuation of the same line, as in the following example:

print(
    sql_formatter.core.format_sql("""
    WITH
    
    first_cte AS (
        SELECT var 
        FROM table1
    ),
    
    second_cte AS (
        SELECT var 
        FROM table2
    )
    
    SELECT a.var
    FROM first_cte
    JOIN second_cte USING (var)
    """)
)

which leads to the following output:

with first_cte as (SELECT var
                   FROM   table1), second_cte as (SELECT var
                               FROM   table2)
SELECT a.var
FROM   first_cte join second_cte using (var)

Describe the solution you'd like

The desired output would be:

WITH

first_cte as (
    SELECT var
    FROM   table1
), 

second_cte as (
    SELECT var
    FROM   table2
)

SELECT a.var
FROM   first_cte join second_cte using (var)

That is:

  • the WITH would be capitalised on its own line
  • the name of the CTE would be on a new line (preferably with an empty line above)
  • the SELECT statement inside the definition of the CTE would start on a new line, idented to represent a block nested within the parentheses ()

Describe alternatives you've considered

sambrilleman avatar Apr 30 '21 02:04 sambrilleman

  • I had a CTE inside CROSS JOIN LATERAL and right now it gets formatted as
    ...
    CROSS JOIN LATERAL ( WITH bar AS (
            SELECT 
                a,
                b,
            FROM foo_func(baz)),
            barf AS (
                SELECT ...
)
           SELECT ...
    ...

i.e. not only is the WITH hanging there on the right, but also the closing parenthesis of the 2nd CTE placed in column 1, even though contained within other parentheses and thoroughly indented otherwise...

ztane avatar Aug 31 '22 20:08 ztane

I would also like to see this improved! The current version of the package makes my WITH statements unreadable

zachmayer avatar Sep 28 '22 14:09 zachmayer

I'd also like to see WITH formatting added

ryanhaarmann avatar Mar 07 '23 20:03 ryanhaarmann

Thanks so much for creating this tool.

I have a CTE that looks like

set variable notationtime_start = '2024-11-04'; set variable notationtime_end = '2024-11-07'; set variable time_start = '2025-01-01 00:00:00'; set variable time_end = '2026-01-01 00:00:00'; set variable variable = 'mean'; set variable key_parent = 'portfolio_calc.all.cashflow_q';
with filter_data AS (
    SELECT ts.model, ts.notationtime, ts.job_id, m.metric, ts.key, date_part('year', time) as year, ts.value, 
    FROM   metrics m
        INNER JOIN timeseries_results_view ts
            ON m.freetext = ts.freetext and
               m.key = ts.key
    WHERE
            variable = getvariable('variable')
            and notationtime >= getvariable('notationtime_start')
            and notationtime <= getvariable('notationtime_end')
            and time >= getvariable('time_start')
            and time < getvariable('time_end')
    ORDER BY ts.freetext, m.metric desc, ts.key limit 10000
),
pivot_table AS (
    PIVOT filter_data ON metric USING sum(value) as value 
)

select * from pivot_table

and after formatting looks like

set variable notationtime_start = '2024-11-04'; set variable notationtime_end = '2024-11-07'; set variable time_start = '2025-01-01 00:00:00'; set variable time_end = '2026-01-01 00:00:00'; set variable variable = 'mean'; set variable key_parent = 'portfolio_calc.all.cashflow_q'; with filter_data as (SELECT ts.model,
                                                                                                                                                                                                                                                                                                                     ts.notationtime,
                                                                                                                                                                                                                                                                                                                     ts.job_id,
                                                                                                                                                                                                                                                                                                                     m.metric,
                                                                                                                                                                                                                                                                                                                     ts.key,
                                                                                                                                                                                                                                                                                                                     date_part('year', time) as year,
                                                                                                                                                                                                                                                                                                                     ts.value
                                                                                                                                                                                                                                                                                                              FROM   metrics m
                                                                                                                                                                                                                                                                                                                  INNER JOIN timeseries_results_view ts
                                                                                                                                                                                                                                                                                                                      ON m.freetext = ts.freetext and
                                                                                                                                                                                                                                                                                                                         m.key = ts.key
                                                                                                                                                                                                                                                                                                              WHERE  variable = getvariable('variable')
                                                                                                                                                                                                                                                                                                                 and notationtime >= getvariable('notationtime_start')
                                                                                                                                                                                                                                                                                                                 and notationtime <= getvariable('notationtime_end')
                                                                                                                                                                                                                                                                                                                 and time >= getvariable('time_start')
                                                                                                                                                                                                                                                                                                                 and time < getvariable('time_end')
                                                                                                                                                                                                                                                                                                              ORDER BY ts.freetext, m.metric desc, ts.key limit 10000), pivot_table as (pivot filter_data
        ON metric using sum(value) as value)
SELECT *
FROM   pivot_table

This makes the CTE highly unreadable.

MarcSkovMadsen avatar Nov 07 '24 16:11 MarcSkovMadsen