
dataform cli format issue when columns contain "--"

Open kevinzhou-izivia opened this issue 1 year ago • 5 comments

config {
  type: "table",
}

SELECT
  source_data_json.timestamp,
  source_data_json.`TEMPERATURE--BODY`,
  source_data_json.`ENERGY_ACTIVE_IMPORT_REGISTER--BODY`,

   FROM
  `XXX.YYY.ZZZ`

is formatted as

config {
  type: "table",
}

SELECT
  source_data_json.timestamp,
  source_data_json.`TEMPERATURE
  --BODY`,
  source_data_json.` ENERGY_ACTIVE_IMPORT_REGISTER
  --BODY`,
FROM
  `XXX.YYY.ZZZ`

expected:

config {
  type: "table",
}

SELECT
  source_data_json.timestamp,
  source_data_json.`TEMPERATURE--BODY`,
  source_data_json.`ENERGY_ACTIVE_IMPORT_REGISTER--BODY`,
FROM
  `XXX.YYY.ZZZ`

dataform --version returns 2.9.0

kevinzhou-izivia avatar May 23 '24 09:05 kevinzhou-izivia

@Ekrekr test that this is still reproducible after https://github.com/dataform-co/dataform/pull/1741 is merged.

Ekrekr avatar May 23 '24 13:05 Ekrekr

The issue is not fixed. I think it is caused by the lexing, where we still treat text inside SQL string literals as comments even when we shouldn't: https://github.com/dataform-co/dataform/blob/2531b120c3869dbbc6efb4ecafbcfa9edb99c738/sqlx/lexer.ts#L374
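
To illustrate the behavior described above, here is a minimal sketch of a quote-aware scan for `--` line comments. It is not the code in sqlx/lexer.ts; the function name `findLineCommentStart` is hypothetical, and it makes simplifying assumptions: it works on a single line, treats backticks, single quotes and double quotes as quoting characters, honors backslash escapes, and ignores triple-quoted and raw strings.

```typescript
// Hypothetical helper for illustration only; not taken from dataform's lexer.
// Returns the index where a `--` line comment starts on one line of SQL,
// or -1 if the line has no comment. A `--` inside quotes is ignored.
function findLineCommentStart(line: string): number {
  // The quote character we are currently inside, or null when outside quotes.
  let quote: string | null = null;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (quote !== null) {
      if (ch === "\\") {
        i++; // Skip the character escaped by the backslash.
      } else if (ch === quote) {
        quote = null; // Closing quote found.
      }
      continue;
    }
    if (ch === "`" || ch === "'" || ch === '"') {
      quote = ch; // Opening quote: stop looking for comments until it closes.
    } else if (ch === "-" && line.charAt(i + 1) === "-") {
      return i; // `--` outside any quotes starts a comment.
    }
  }
  return -1;
}

// The column from this issue contains `--` but no comment:
console.log(findLineCommentStart("  source_data_json.`TEMPERATURE--BODY`,")); // → -1
```

With a scan like this, the `--` in `TEMPERATURE--BODY` stays part of the identifier, while a trailing `-- test` after a closing backtick is still recognized as a comment.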

Ekrekr avatar Aug 16 '24 11:08 Ekrekr

Another, possibly related, error, seen with Dataform CLI version 3.0.2.

Example: definitions/test.sqlx

config { type: "view"} 
WITH int_table AS (
SELECT id
FROM `my_dataset.my_table`) -- test
SELECT id
FROM int_table

What we get: when running dataform format --actions="definitions/test.sqlx", an error is returned: Errors encountered during formatting: definitions/test.sqlx: Formatter unable to determine final formatted form.

While we expect:

config {
  type: "view"
}

WITH
  int_table AS (
  SELECT
    id
  FROM
    `my_dataset.my_table`) -- test
SELECT
  id
FROM
  int_table

kevin-zhou-dev avatar Sep 19 '24 10:09 kevin-zhou-dev

Hello, I have found what may be a new bug in the Dataform formatter.

It seems to struggle with any file that contains a pre-operation. Example:

config {
  type: "view"
}

pre_operations {
declare day_start date;
set day_start = current_date()
}

WITH int_table AS (
SELECT id
FROM `my_dataset.my_table` where dt = day_start) -- test
SELECT id
FROM int_table

When running the formatter, the returned output is:

config {
  type: "view"
}

WITH
  int_table AS (
  SELECT
    id
  FROM
    `my_dataset.my_table`
  WHERE
     dt = day_start) -- test
SELECT
  id
FROM
  int_tab

Furthermore, with more complex pre-operations on incrementals, the formatter fails outright, e.g. when running it on the following query:

config {
  type: "incremental"
}
  pre_operations {
    declare day_start, day_end date;
    set (day_start, day_end) = (
      ${when(incremental(),
        `
        select as struct
          cast('${constants.startDate}' as date) as day_start,
          date_add(max(date(start_tstamp)), ${constants.backfillInterval}) as day_end
        from ${self()}
        `,
        `
        select as struct
          cast('${constants.startDate}' as date) as day_start,
          date_add(cast('${constants.startDate}' as date), ${constants.backfillInterval}) as day_end
        `
      )}
    )
    ;
  }
WITH int_table AS (
SELECT id
FROM `my_dataset.my_table` where dt > day_start and dt < day_end) -- test
SELECT id
FROM int_table

We just receive the error: Errors encountered during formatting: definitions/test.sqlx: Formatter unable to determine final formatted form.

CTSebClarkson avatar Mar 28 '25 12:03 CTSebClarkson

I'm running into something like this as well with CASE statements. I wanted to write up what I'm seeing in case it's helpful.

$ dataform --version
3.0.19

Minimal reproduction case:

SELECT
  CASE
    WHEN id = 1 THEN 'one' -- this comment breaks it
    WHEN id = 2 THEN 'two'
  ELSE
  'other'
END
  AS identifier
FROM
  some_table

With the comment on line 3, running dataform format will give:

Errors encountered during formatting:
  definitions/test.sqlx: Formatter unable to determine final formatted form.

Without that comment, or when the comment is on its own line, dataform format works.

panozzaj avatar May 14 '25 00:05 panozzaj