action-sqlfluff
action-sqlfluff copied to clipboard
SQLFluff parse error in lint mode
@yu-iskw See https://github.com/yu-iskw/action-sqlfluff/issues/31
The issue seems to be related to this where more recent versions of dbt output to stdout messages like Partial parse save file not found. Starting full parse. which get inadvertently written to the SQLFluff output. So the result of sqlfluff lint becomes e.g.
[0m12:43:09 Partial parse save file not found. Starting full parse.
[{"filepath": "models/adyen/fact_adyen_dispute_transaction_report.sql", "violations": [{"line_no": 13, "line_pos": 1, "code": "L036", "description": "Select targets should be on a new line unless there is only one select target."}, {"line_no": 13, "line_pos": 1, "code": "L044", "description": "Query produces an unknown number of result columns."}, {"line_no": 13, "line_pos": 7, "code": "L001", "description": "Unnecessary trailing whitespace."}, {"line_no": 15, "line_pos": 1, "code": "L003", "description": "Expected 1 indentation, found 0 [compared to line 03]"}]}, {"filepath": "models/adyen/fact_adyen_payment_accounting.sql", "violations": [{"line_no": 13, "line_pos": 1, "code": "L036", "description": "Select targets should be on a new line unless there is only one select target."}, {"line_no": 13, "line_pos": 1, "code": "L044", "description": "Query produces an unknown number of result columns."}, {"line_no": 13, "line_pos": 7, "code": "L001", "description": "Unnecessary trailing whitespace."}, {"line_no": 14, "line_pos": 1, "code": "L003", "description": "Expected 1 indentation, found 0 [compared to line 03]"}, {"line_no": 16, "line_pos": 34, "code": "L009", "description": "Files must end with a single trailing newline."}]}]
instead of
[{"filepath": "models/adyen/fact_adyen_dispute_transaction_report.sql", "violations": [{"line_no": 13, "line_pos": 1, "code": "L036", "description": "Select targets should be on a new line unless there is only one select target."}, {"line_no": 13, "line_pos": 1, "code": "L044", "description": "Query produces an unknown number of result columns."}, {"line_no": 13, "line_pos": 7, "code": "L001", "description": "Unnecessary trailing whitespace."}, {"line_no": 15, "line_pos": 1, "code": "L003", "description": "Expected 1 indentation, found 0 [compared to line 03]"}]}, {"filepath": "models/adyen/fact_adyen_payment_accounting.sql", "violations": [{"line_no": 13, "line_pos": 1, "code": "L036", "description": "Select targets should be on a new line unless there is only one select target."}, {"line_no": 13, "line_pos": 1, "code": "L044", "description": "Query produces an unknown number of result columns."}, {"line_no": 13, "line_pos": 7, "code": "L001", "description": "Unnecessary trailing whitespace."}, {"line_no": 14, "line_pos": 1, "code": "L003", "description": "Expected 1 indentation, found 0 [compared to line 03]"}, {"line_no": 16, "line_pos": 34, "code": "L009", "description": "Files must end with a single trailing newline."}]}]
My workaround has been to pipe the results of the sqlfluff lint to grep '^\[' to exclude any unwanted rows added to the output by dbt, e.g.
# Lint changed files if the mode is lint
if [[ "${SQLFLUFF_COMMAND:?}" == "lint" ]]; then
echo '::group:: Running sqlfluff 🐶 ...'
# Allow failures now, as reviewdog handles them
set +Eeuo pipefail
lint_results="sqlfluff-lint.json"
# shellcheck disable=SC2086,SC2046
sqlfluff lint \
--format json \
$(if [[ "x${SQLFLUFF_CONFIG}" != "x" ]]; then echo "--config ${SQLFLUFF_CONFIG}"; fi) \
$(if [[ "x${SQLFLUFF_DIALECT}" != "x" ]]; then echo "--dialect ${SQLFLUFF_DIALECT}"; fi) \
$(if [[ "x${SQLFLUFF_PROCESSES}" != "x" ]]; then echo "--processes ${SQLFLUFF_PROCESSES}"; fi) \
$(if [[ "x${SQLFLUFF_RULES}" != "x" ]]; then echo "--rules ${SQLFLUFF_RULES}"; fi) \
$(if [[ "x${SQLFLUFF_EXCLUDE_RULES}" != "x" ]]; then echo "--exclude-rules ${SQLFLUFF_EXCLUDE_RULES}"; fi) \
$(if [[ "x${SQLFLUFF_TEMPLATER}" != "x" ]]; then echo "--templater ${SQLFLUFF_TEMPLATER}"; fi) \
$(if [[ "x${SQLFLUFF_DISABLE_NOQA}" != "x" ]]; then echo "--disable-noqa ${SQLFLUFF_DISABLE_NOQA}"; fi) \
$changed_files |
grep '^\[' \
>> "$lint_results"
sqlfluff_exit_code="${PIPESTATUS[0]}"
echo "sqlfluff_exit_code = $sqlfluff_exit_code"
echo "echo and cat lint_results start"
echo "echo lint_results = $lint_results"
cat "$lint_results"
echo "echo and cat lint_results end"
echo "::set-output name=sqlfluff-results::$(cat <"$lint_results" | jq -r -c '.')" # Convert to a single line
echo "::set-output name=sqlfluff-exit-code::${sqlfluff_exit_code}"
set -Eeuo pipefail
echo '::endgroup::'
echo '::group:: Running reviewdog 🐶 ...'
# Allow failures now, as reviewdog handles them
set +Eeuo pipefail
lint_results_rdjson="sqlfluff-lint.rdjson"
cat "$lint_results" |
jq -r -f "${SCRIPT_DIR}/to-rdjson.jq" \
>> "$lint_results_rdjson"
echo "echo and cat lint_results_rdjson start"
echo "echo lint_results_rdjson = $lint_results_rdjson"
cat "$lint_results_rdjson"
cat "$lint_results"
echo "echo and cat lint_results_rdjson end"
cat <"$lint_results_rdjson" |
reviewdog -f=rdjson \
-name="sqlfluff-lint" \
-reporter="${REVIEWDOG_REPORTER}" \
-filter-mode="${REVIEWDOG_FILTER_MODE}" \
-fail-on-error="${REVIEWDOG_FAIL_ON_ERROR}" \
-level="${REVIEWDOG_LEVEL}"
reviewdog_return_code="${PIPESTATUS[1]}"
echo "::set-output name=sqlfluff-results-rdjson::$(cat <"$lint_results_rdjson" | jq -r -c '.')" # Convert to a single line
echo "::set-output name=reviewdog-return-code::${reviewdog_return_code}"
set -Eeuo pipefail
echo '::endgroup::'
exit $sqlfluff_exit_code
Possibly these problems have gone unnoticed by some because this GitHub Action's tests utilize dbt-bigquery==1.0.0 which was released in December 2021, and these stdout messages may have only started happening with a more recent dbt-bigquery release.
Seeing this as well and was also experiencing it with running SQLFluff lint directly.
So was doing a similar grep before piping into the annotations.json
a better way to resolve (i.e. remove) Partial parse save file not found. Starting full parse. is to set partial_parse: False in the profiles.yml file.
We're using a dummy profiles.yml being on dbt Cloud, but having:
config:
partial_parse: False
allows sqlfluff_command: "lint" to run without issue.