dataform cli format issue when columns contain "--"
config {
type: "table",
}
SELECT
source_data_json.timestamp,
source_data_json.`TEMPERATURE--BODY`,
source_data_json.`ENERGY_ACTIVE_IMPORT_REGISTER--BODY`,
FROM
`XXX.YYY.ZZZ`
is formatted as
config {
type: "table",
}
SELECT
source_data_json.timestamp,
source_data_json.`TEMPERATURE
--BODY`,
source_data_json.` ENERGY_ACTIVE_IMPORT_REGISTER
--BODY`,
FROM
`XXX.YYY.ZZZ`
expected:
config {
type: "table",
}
SELECT
source_data_json.timestamp,
source_data_json.`TEMPERATURE--BODY`,
source_data_json.`ENERGY_ACTIVE_IMPORT_REGISTER--BODY`,
FROM
`XXX.YYY.ZZZ`
dataform --version returns 2.9.0
@Ekrekr test that this is still reproducible after https://github.com/dataform-co/dataform/pull/1741 is merged.
The issue is not fixed. I think it is caused by the lexing, where we still treat the contents of SQL literal strings as comments even when we shouldn't: https://github.com/dataform-co/dataform/blob/2531b120c3869dbbc6efb4ecafbcfa9edb99c738/sqlx/lexer.ts#L374
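To illustrate the suspected cause, here is a minimal TypeScript sketch of a comment scanner that ignores `--` inside quoted regions. It is not the Dataform lexer; `findLineCommentStart` is a hypothetical name, and the assumption that a backslash escapes the next character inside backticks and string literals is a simplification that also ignores multi-line strings.

```typescript
// Hypothetical helper, NOT taken from sqlx/lexer.ts.
// Returns the index where a `--` line comment starts, or -1 if the line
// has no comment outside of quoted identifiers and string literals.
function findLineCommentStart(line: string): number {
  let quote: string | null = null; // the quote character we are inside, if any
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (quote !== null) {
      if (ch === "\\") {
        i++; // assumption: a backslash escapes the following character
      } else if (ch === quote) {
        quote = null; // closing quote found
      }
      continue; // anything inside quotes is never a comment start
    }
    if (ch === "`" || ch === "'" || ch === '"') {
      quote = ch; // entering a quoted identifier or string literal
    } else if (ch === "-" && line.charAt(i + 1) === "-") {
      return i; // `--` outside quotes starts a comment
    }
  }
  return -1;
}

// The column from the original report contains `--` but no comment.
console.log(findLineCommentStart("source_data_json.`TEMPERATURE--BODY`,"));
// A real trailing comment is still detected.
console.log(findLineCommentStart("FROM `my_dataset.my_table`) -- test"));
```

With a scan like this, `TEMPERATURE--BODY` would stay inside a single identifier token instead of being split at the `--`.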
Another, possibly related, error.
Executed with Dataform CLI version 3.0.2.
Example: definitions/test.sqlx
config { type: "view"}
WITH int_table AS (
SELECT id
FROM `my_dataset.my_table`) -- test
SELECT id
FROM int_table
What we get
When running dataform format --actions="definitions/test.sqlx", an error is returned:
Errors encountered during formatting: definitions/test.sqlx: Formatter unable to determine final formatted form.
While we expect:
config {
type: "view"
}
WITH
int_table AS (
SELECT
id
FROM
`my_dataset.my_table`) -- test
SELECT
id
FROM
int_table
Hello, I have found what may be a new bug in the Dataform formatter.
It seems to struggle with any file that contains a pre_operations block. Example:
config {
type: "view"
}
pre_operations {
declare day_start date;
set day_start = current_date()
}
WITH int_table AS (
SELECT id
FROM `my_dataset.my_table` where dt = day_start) -- test
SELECT id
FROM int_table
When running the formatter, the returned output is:
config {
type: "view"
}
WITH
int_table AS (
SELECT
id
FROM
`my_dataset.my_table`
WHERE
dt = day_start) -- test
SELECT
id
FROM
int_tab
Furthermore, with more complex pre-operations on incremental tables, the formatter fails outright, e.g. when it is run on the following query:
config {
type: "incremental"
}
pre_operations {
declare day_start, day_end date;
set (day_start, day_end) = (
${when(incremental(),
`
select as struct
cast('${constants.startDate}' as date) as day_start,
date_add(max(date(start_tstamp)), ${constants.backfillInterval}) as day_end
from ${self()}
`,
`
select as struct
cast('${constants.startDate}' as date) as day_start,
date_add(cast('${constants.startDate}' as date), ${constants.backfillInterval}) as day_end
`
)}
)
;
}
WITH int_table AS (
SELECT id
FROM `my_dataset.my_table` where dt > day_start and dt < day_end) -- test
SELECT id
FROM int_table
We just receive the error: Errors encountered during formatting: definitions/test.sqlx: Formatter unable to determine final formatted form.
I'm running into something similar with CASE statements. Writing up what I'm seeing in case it's helpful.
$ dataform --version
3.0.19
Minimal reproduction case:
SELECT
CASE
WHEN id = 1 THEN 'one' -- this comment breaks it
WHEN id = 2 THEN 'two'
ELSE
'other'
END
AS identifier
FROM
some_table
With the comment on line 3, running dataform format will give:
Errors encountered during formatting:
definitions/test.sqlx: Formatter unable to determine final formatted form.
Without that comment, or when the comment is on its own line, dataform format works.
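Since moving the comment onto its own line avoids the error, a small script can help locate the lines worth checking. The following TypeScript is only a sketch under assumptions: `linesWithTrailingComments` is a hypothetical helper, not part of the Dataform CLI, and it handles single-line quoted strings and backtick identifiers only (no triple-quoted strings or `/* */` comments).

```typescript
// Hypothetical helper, not part of Dataform: returns the 1-based numbers of
// lines that have code followed by a `--` comment on the same line.
function linesWithTrailingComments(sql: string): number[] {
  // Matches backtick identifiers and single/double quoted strings on one line,
  // assuming a backslash escapes the next character.
  const quoted = /`(?:\\.|[^`\\])*`|'(?:\\.|[^'\\])*'|"(?:\\.|[^"\\])*"/g;
  const hits: number[] = [];
  sql.split("\n").forEach((line, index) => {
    // Replace quoted regions with a placeholder so a `--` inside them
    // is not mistaken for a comment.
    const stripped = line.replace(quoted, "q");
    const commentAt = stripped.indexOf("--");
    // Report only comments that have non-whitespace code before them.
    if (commentAt > 0 && stripped.slice(0, commentAt).trim() !== "") {
      hits.push(index + 1);
    }
  });
  return hits;
}

const query = [
  "SELECT",
  "CASE",
  "WHEN id = 1 THEN 'one' -- this comment breaks it",
  "WHEN id = 2 THEN 'two'",
  "ELSE",
  "'other'",
  "END",
  "AS identifier",
  "FROM",
  "some_table",
].join("\n");

console.log(linesWithTrailingComments(query)); // prints [ 3 ]
```

Comments that already sit on their own line, and `--` inside quoted identifiers, are not reported.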