
Failure in column_anomalies test when timestamp_column isn't provided

Open angeml opened this issue 1 year ago • 3 comments

Describe the bug

This line of code is causing failures on Databricks for the column_anomalies test when a timestamp_column is missing.

Caused by: org.apache.spark.sql.catalyst.ExtendedAnalysisException: [UNRESOLVED_COLUMN.WITHOUT_SUGGESTION] A column, variable, or function parameter with name `last_session_start_ts` cannot be resolved.  SQLSTATE: 42703; line 30 pos 19
	at org.apache.spark.sql.catalyst.ExtendedAnalysisException.copyPlan(ExtendedAnalysisException.scala:91)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.$anonfun$execute$1(SparkExecuteStatementOperation.scala:688)

I've resolved this locally by adding a timestamp_column, but others might not have that option.
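
For anyone who wants to try the same workaround, here is a rough sketch of what it might look like in a schema file. The model name (sessions), the monitored column (session_duration), the timestamp column (updated_at), and the null_count monitor are all hypothetical placeholders, not taken from my project, so adjust them to your own model.

```yaml
# Sketch only: model, column, and timestamp names below are hypothetical.
version: 2

models:
  - name: sessions
    columns:
      - name: session_duration
        tests:
          - elementary.column_anomalies:
              column_anomalies:
                - null_count
              # Workaround: setting timestamp_column avoided the failing code path for me.
              timestamp_column: updated_at
```

This only helps if the model actually has a usable timestamp column; it does not fix the underlying bug.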

To Reproduce

  1. Create a column_anomalies test for a model that doesn't have a timestamp_column
  2. Run test on Databricks
  3. Observe that the extra trailing comma (,) at the end of start_bucket_in_data causes an error on Databricks

Expected behavior

This test should not produce an error.

Screenshots

[Screenshot 2024-06-13 at 12 46 17 PM]

[Screenshot 2024-06-13 at 12 47 01 PM]

Environment (please complete the following information):

  • Elementary CLI (edr) version: 0.15.1, can be found by running pip show elementary-data
  • Elementary dbt package version: 0.15.2, can be found in packages.yml file
  • dbt version you're using: 1.7.1
  • Data warehouse: Databricks
  • Infrastructure details: prod

Additional context

Slack - https://elementary-community.slack.com/archives/C02CTC89LAX/p1716306300184349

Would you be willing to contribute a fix for this issue? For sure 👍 But I think it just needs a comma removal 😄

angeml, Jun 13 '24 16:06

Just realized that I likely should have created this issue in the https://github.com/elementary-data/dbt-data-reliability repo

angeml, Jun 14 '24 14:06

Hi @angeml! Thanks for opening this issue, and sorry for the delayed response. Yes, you are absolutely right: it seems this flow was broken, and we actually have a PR that fixes it, which should be merged in the near future.

haritamar, Jun 25 '24 16:06

Hi guys! I've been experiencing the same issue with the column_anomalies test running on Trino, so I'm looking forward to this solution.

Larissa-Rocha, Jun 27 '24 16:06