action-sqlfluff icon indicating copy to clipboard operation
action-sqlfluff copied to clipboard

SQLFluff parse error in lint mode

Open kieronellis opened this issue 3 years ago • 1 comments

@yu-iskw See https://github.com/yu-iskw/action-sqlfluff/issues/31

kieronellis avatar Aug 29 '22 13:08 kieronellis

The issue seems to be related to this where more recent versions of dbt output to stdout messages like Partial parse save file not found. Starting full parse. which get inadvertently written to the SQLFluff output. So the result of sqlfluff lint becomes e.g.

[0m12:43:09  Partial parse save file not found. Starting full parse.
[{"filepath": "models/adyen/fact_adyen_dispute_transaction_report.sql", "violations": [{"line_no": 13, "line_pos": 1, "code": "L036", "description": "Select targets should be on a new line unless there is only one select target."}, {"line_no": 13, "line_pos": 1, "code": "L044", "description": "Query produces an unknown number of result columns."}, {"line_no": 13, "line_pos": 7, "code": "L001", "description": "Unnecessary trailing whitespace."}, {"line_no": 15, "line_pos": 1, "code": "L003", "description": "Expected 1 indentation, found 0 [compared to line 03]"}]}, {"filepath": "models/adyen/fact_adyen_payment_accounting.sql", "violations": [{"line_no": 13, "line_pos": 1, "code": "L036", "description": "Select targets should be on a new line unless there is only one select target."}, {"line_no": 13, "line_pos": 1, "code": "L044", "description": "Query produces an unknown number of result columns."}, {"line_no": 13, "line_pos": 7, "code": "L001", "description": "Unnecessary trailing whitespace."}, {"line_no": 14, "line_pos": 1, "code": "L003", "description": "Expected 1 indentation, found 0 [compared to line 03]"}, {"line_no": 16, "line_pos": 34, "code": "L009", "description": "Files must end with a single trailing newline."}]}]

instead of

[{"filepath": "models/adyen/fact_adyen_dispute_transaction_report.sql", "violations": [{"line_no": 13, "line_pos": 1, "code": "L036", "description": "Select targets should be on a new line unless there is only one select target."}, {"line_no": 13, "line_pos": 1, "code": "L044", "description": "Query produces an unknown number of result columns."}, {"line_no": 13, "line_pos": 7, "code": "L001", "description": "Unnecessary trailing whitespace."}, {"line_no": 15, "line_pos": 1, "code": "L003", "description": "Expected 1 indentation, found 0 [compared to line 03]"}]}, {"filepath": "models/adyen/fact_adyen_payment_accounting.sql", "violations": [{"line_no": 13, "line_pos": 1, "code": "L036", "description": "Select targets should be on a new line unless there is only one select target."}, {"line_no": 13, "line_pos": 1, "code": "L044", "description": "Query produces an unknown number of result columns."}, {"line_no": 13, "line_pos": 7, "code": "L001", "description": "Unnecessary trailing whitespace."}, {"line_no": 14, "line_pos": 1, "code": "L003", "description": "Expected 1 indentation, found 0 [compared to line 03]"}, {"line_no": 16, "line_pos": 34, "code": "L009", "description": "Files must end with a single trailing newline."}]}]

My workaround has been to pipe the results of the sqlfluff lint to grep '^\[' to exclude any unwanted rows added to the output by dbt, e.g.


# Lint changed files if the mode is lint
if [[ "${SQLFLUFF_COMMAND:?}" == "lint" ]]; then
  echo '::group:: Running sqlfluff 🐶 ...'
  # Allow failures now, as reviewdog handles them
  set +Eeuo pipefail
  lint_results="sqlfluff-lint.json"
  # shellcheck disable=SC2086,SC2046
  sqlfluff lint \
      --format json \
      $(if [[ "x${SQLFLUFF_CONFIG}" != "x" ]]; then echo "--config ${SQLFLUFF_CONFIG}"; fi) \
      $(if [[ "x${SQLFLUFF_DIALECT}" != "x" ]]; then echo "--dialect ${SQLFLUFF_DIALECT}"; fi) \
      $(if [[ "x${SQLFLUFF_PROCESSES}" != "x" ]]; then echo "--processes ${SQLFLUFF_PROCESSES}"; fi) \
      $(if [[ "x${SQLFLUFF_RULES}" != "x" ]]; then echo "--rules ${SQLFLUFF_RULES}"; fi) \
      $(if [[ "x${SQLFLUFF_EXCLUDE_RULES}" != "x" ]]; then echo "--exclude-rules ${SQLFLUFF_EXCLUDE_RULES}"; fi) \
      $(if [[ "x${SQLFLUFF_TEMPLATER}" != "x" ]]; then echo "--templater ${SQLFLUFF_TEMPLATER}"; fi) \
      $(if [[ "x${SQLFLUFF_DISABLE_NOQA}" != "x" ]]; then echo "--disable-noqa ${SQLFLUFF_DISABLE_NOQA}"; fi) \
      $changed_files |
    grep '^\[' \
    >> "$lint_results"
  sqlfluff_exit_code="${PIPESTATUS[0]}"
  echo "sqlfluff_exit_code = $sqlfluff_exit_code"

  echo "echo and cat lint_results start"
  echo "echo lint_results = $lint_results"
  cat "$lint_results"
  echo "echo and cat lint_results end"

  echo "::set-output name=sqlfluff-results::$(cat <"$lint_results" | jq -r -c '.')" # Convert to a single line
  echo "::set-output name=sqlfluff-exit-code::${sqlfluff_exit_code}"

  set -Eeuo pipefail
  echo '::endgroup::'

  echo '::group:: Running reviewdog 🐶 ...'
  # Allow failures now, as reviewdog handles them
  set +Eeuo pipefail

  lint_results_rdjson="sqlfluff-lint.rdjson"
  cat "$lint_results" |
    jq -r -f "${SCRIPT_DIR}/to-rdjson.jq" \
    >> "$lint_results_rdjson"
  echo "echo and cat lint_results_rdjson start"
  echo "echo lint_results_rdjson = $lint_results_rdjson"
  cat "$lint_results_rdjson"
  cat "$lint_results"
  echo "echo and cat lint_results_rdjson end"

  cat <"$lint_results_rdjson" |
    reviewdog -f=rdjson \
      -name="sqlfluff-lint" \
      -reporter="${REVIEWDOG_REPORTER}" \
      -filter-mode="${REVIEWDOG_FILTER_MODE}" \
      -fail-on-error="${REVIEWDOG_FAIL_ON_ERROR}" \
      -level="${REVIEWDOG_LEVEL}"
  reviewdog_return_code="${PIPESTATUS[1]}"

  echo "::set-output name=sqlfluff-results-rdjson::$(cat <"$lint_results_rdjson" | jq -r -c '.')" # Convert to a single line
  echo "::set-output name=reviewdog-return-code::${reviewdog_return_code}"

  set -Eeuo pipefail
  echo '::endgroup::'

  exit $sqlfluff_exit_code

Possibly these problems have gone unnoticed by some because this GitHub Action's tests utilize dbt-bigquery==1.0.0 which was released in December 2021, and these stdout messages may have only started happening with a more recent dbt-bigquery release.

kieronellis avatar Sep 02 '22 15:09 kieronellis

Seeing this as well and was also experiencing it with running SQLFluff lint directly.

So was doing a similar grep before piping into the annotations.json

rossserven avatar Oct 13 '22 02:10 rossserven

a better way to resolve (i.e. remove) Partial parse save file not found. Starting full parse. is to set partial_parse: False in the profiles.yml file.

We're using a dummy profiles.yml being on dbt Cloud, but having:

config:
    partial_parse: False

allows sqlfluff_command: "lint" to run without issue.

rossserven avatar Oct 28 '22 02:10 rossserven