airflow icon indicating copy to clipboard operation
airflow copied to clipboard

Statsd metric name components are unsanitized

Open waldoppper opened this issue 2 years ago • 2 comments

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

[2.2.5] I recently enabled statsd metric emission on my team's deploy and began writing views on the result.

I've found that:

  1. some dag authors had included a period in their task_id name - like task_id = "load.all"
  2. this string is not sanitized of statsd-sensitive characters (like '.' and '|') prior to the building of the Metric name

Result: metric names with too many .-delimiters and grafana views which don't render these task metrics.

What you think should happen instead

Any instance of Statsd.foo(f'{dag_id}...') or Statsd.foo(f'{task_id}...') should probably first sanitize those strings of statsd-sensitive characters.

So instead of

Current: Statsd.timer("dag.{self.task.dag_id}.{self.task.task_id}.duration") -> "dag.DAG.WITH.PERIODS.IN_NAME.TASK.WITH.PERIODS.IN.NAME.duration"

Recommended: Stats.timer(f"dag.{statsd_sanitize(self.task.dag_id)}.{statsd_sanitize(self.task.task_id)}.duration") -> "dag.DAG_WITH_PERIODS_IN_NAME.TASK_WITH_PERIODS_IN_NAME.duration"

How to reproduce

  1. create a dag
  2. write a task in that dag with task_id which contains a period. Like: "foo.bar"
  3. enable statsd metrics collection
  4. run your dag
  5. struggle to deal with poorly-formatted metric names like "dag.DAG.WITH.PERIODS.IN.NAME.TASK.WITH.PERIODS.IN.NAME.duration"

Operating System

debian

Versions of Apache Airflow Providers

apache-airflow-providers-apache-hive==2.3.1
apache-airflow-providers-apache-spark==2.1.2
apache-airflow-providers-celery==2.1.3
apache-airflow-providers-cncf-kubernetes==3.0.0
apache-airflow-providers-docker==2.5.2
apache-airflow-providers-elasticsearch==2.2.0
apache-airflow-providers-ftp==2.1.2
apache-airflow-providers-google==6.7.0
apache-airflow-providers-grpc==2.0.4
apache-airflow-providers-hashicorp==2.1.4
apache-airflow-providers-http==2.1.2
apache-airflow-providers-imap==2.2.3
apache-airflow-providers-microsoft-azure==3.7.2
apache-airflow-providers-mysql==2.2.3
apache-airflow-providers-odbc==2.0.4
apache-airflow-providers-postgres==4.1.0
apache-airflow-providers-redis==2.0.4
apache-airflow-providers-sendgrid==2.0.4
apache-airflow-providers-sftp==2.5.2
apache-airflow-providers-slack==4.2.3
apache-airflow-providers-sqlite==2.1.3
apache-airflow-providers-ssh==2.4.3

Deployment

Official Apache Airflow Helm Chart

Deployment details

Deployed with v8.6.1 of https://github.com/airflow-helm/charts/tree/main/charts/airflow

We wired up a pod to convert statsd->prom using https://github.com/prometheus/statsd_exporter, and use Matcher rules by glob/regex to convert statsd-style metrics into prom-style.

Anything else

No response

Are you willing to submit PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

waldoppper avatar Jan 31 '23 16:01 waldoppper

Thanks for opening your first issue here! Be sure to follow the issue template!

boring-cyborg[bot] avatar Jan 31 '23 16:01 boring-cyborg[bot]

This issue has been automatically marked as stale because it has been open for 365 days without any activity. There has been several Airflow releases since last activity on this issue. Kindly asking to recheck the report against latest Airflow version and let us know if the issue is reproducible. The issue will be closed in next 30 days if no further activity occurs from the issue author.

github-actions[bot] avatar Feb 20 '24 07:02 github-actions[bot]

This issue has been closed because it has not received response from the issue author.

github-actions[bot] avatar Mar 21 '24 07:03 github-actions[bot]