opentelemetry-collector-contrib
[receiver/httpcheck] emit single `httpcheck.status` datapoint instead of five
Component(s)
receiver/httpcheck
Version
v0.78.0
Is your feature request related to a problem? Please describe.
The HTTP Check receiver currently emits six data points for a single endpoint: one `httpcheck.duration` point and five `httpcheck.status` points, one per status class. For example, the following configuration:
exporters:
  logging:
    verbosity: detailed
receivers:
  httpcheck:
    endpoint: https://opentelemetry.io
service:
  pipelines:
    metrics:
      exporters:
        - logging
      receivers:
        - httpcheck
gives the following output:
$ otelcol-contrib-0.78.0 --config config.yaml
2023-06-01T12:48:14.930+0200 info service/telemetry.go:104 Setting up own telemetry...
2023-06-01T12:48:14.930+0200 info service/telemetry.go:127 Serving Prometheus metrics {"address": ":8888", "level": "Basic"}
2023-06-01T12:48:14.930+0200 info [email protected]/exporter.go:275 Development component. May change in the future. {"kind": "exporter", "data_type": "metrics", "name": "logging"}
2023-06-01T12:48:14.930+0200 info [email protected]/receiver.go:296 Development component. May change in the future. {"kind": "receiver", "name": "httpcheck", "data_type": "metrics"}
2023-06-01T12:48:14.954+0200 info service/service.go:131 Starting otelcol-contrib... {"Version": "0.78.0", "NumCPU": 16}
2023-06-01T12:48:14.954+0200 info extensions/extensions.go:30 Starting extensions...
2023-06-01T12:48:14.956+0200 info service/service.go:148 Everything is ready. Begin running and processing data.
2023-06-01T12:48:18.905+0200 info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 1, "metrics": 2, "data points": 6}
2023-06-01T12:48:18.905+0200 info ResourceMetrics #0
Resource SchemaURL:
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope otelcol/httpcheckreceiver 0.78.0
Metric #0
Descriptor:
-> Name: httpcheck.duration
-> Description: Measures the duration of the HTTP check.
-> Unit: ms
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> http.url: Str(https://opentelemetry.io)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 941
Metric #1
Descriptor:
-> Name: httpcheck.status
-> Description: 1 if the check resulted in status_code matching the status_class, otherwise 0.
-> Unit: 1
-> DataType: Sum
-> IsMonotonic: false
-> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
-> http.url: Str(https://opentelemetry.io)
-> http.status_code: Int(200)
-> http.method: Str(GET)
-> http.status_class: Str(1xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 0
NumberDataPoints #1
Data point attributes:
-> http.url: Str(https://opentelemetry.io)
-> http.status_code: Int(200)
-> http.method: Str(GET)
-> http.status_class: Str(2xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 1
NumberDataPoints #2
Data point attributes:
-> http.url: Str(https://opentelemetry.io)
-> http.status_code: Int(200)
-> http.method: Str(GET)
-> http.status_class: Str(3xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 0
NumberDataPoints #3
Data point attributes:
-> http.url: Str(https://opentelemetry.io)
-> http.status_code: Int(200)
-> http.method: Str(GET)
-> http.status_class: Str(4xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 0
NumberDataPoints #4
Data point attributes:
-> http.url: Str(https://opentelemetry.io)
-> http.status_code: Int(200)
-> http.method: Str(GET)
-> http.status_class: Str(5xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 0
{"kind": "exporter", "data_type": "metrics", "name": "logging"}
The data points with value 0 carry little information. Ideally, I would expect only one `httpcheck.status` data point to be emitted per endpoint.
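For context, the five `httpcheck.status` points come from the receiver recording one point per status class, with value 1 only for the class the response code falls into. A simplified illustration in plain Go (the names are mine for illustration, not the receiver's actual code):

```go
package main

import "fmt"

// statusClasses maps each class label to the lower bound of its code range.
var statusClasses = map[string]int{
	"1xx": 100,
	"2xx": 200,
	"3xx": 300,
	"4xx": 400,
	"5xx": 500,
}

// recordStatus mimics the fan-out: one data point per status class,
// 1 for the class the response code falls into, 0 for the other four.
func recordStatus(statusCode int) map[string]int64 {
	points := make(map[string]int64, len(statusClasses))
	for class, lower := range statusClasses {
		if statusCode >= lower && statusCode < lower+100 {
			points[class] = 1
		} else {
			points[class] = 0
		}
	}
	return points
}

func main() {
	fmt.Println(recordStatus(200)) // map[1xx:0 2xx:1 3xx:0 4xx:0 5xx:0]
}
```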
Describe the solution you'd like
I propose adding a configuration option to emit only non-zero data points:
receivers:
  httpcheck:
    endpoint: https://opentelemetry.io
    emit_zero_values: false # we might need a better name for this configuration property
so that the output would be something like:
$ otelcol-contrib-0.78.0 --config config.yaml
2023-06-01T12:48:14.930+0200 info service/telemetry.go:104 Setting up own telemetry...
2023-06-01T12:48:14.930+0200 info service/telemetry.go:127 Serving Prometheus metrics {"address": ":8888", "level": "Basic"}
2023-06-01T12:48:14.930+0200 info [email protected]/exporter.go:275 Development component. May change in the future. {"kind": "exporter", "data_type": "metrics", "name": "logging"}
2023-06-01T12:48:14.930+0200 info [email protected]/receiver.go:296 Development component. May change in the future. {"kind": "receiver", "name": "httpcheck", "data_type": "metrics"}
2023-06-01T12:48:14.954+0200 info service/service.go:131 Starting otelcol-contrib... {"Version": "0.78.0", "NumCPU": 16}
2023-06-01T12:48:14.954+0200 info extensions/extensions.go:30 Starting extensions...
2023-06-01T12:48:14.956+0200 info service/service.go:148 Everything is ready. Begin running and processing data.
2023-06-01T12:48:18.905+0200 info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 1, "metrics": 2, "data points": 2}
2023-06-01T12:48:18.905+0200 info ResourceMetrics #0
Resource SchemaURL:
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope otelcol/httpcheckreceiver 0.78.0
Metric #0
Descriptor:
-> Name: httpcheck.duration
-> Description: Measures the duration of the HTTP check.
-> Unit: ms
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> http.url: Str(https://opentelemetry.io)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 941
Metric #1
Descriptor:
-> Name: httpcheck.status
-> Description: 1 if the check resulted in status_code matching the status_class, otherwise 0.
-> Unit: 1
-> DataType: Sum
-> IsMonotonic: false
-> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
-> http.url: Str(https://opentelemetry.io)
-> http.status_code: Int(200)
-> http.method: Str(GET)
-> http.status_class: Str(2xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 1
{"kind": "exporter", "data_type": "metrics", "name": "logging"}
I also think this might be a good default for this receiver.
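A rough sketch of how such an option could behave inside the scraper, in plain Go (`recordStatus` and `emitZeroValues` are illustrative names, not the receiver's actual code):

```go
package main

import "fmt"

var statusClasses = []string{"1xx", "2xx", "3xx", "4xx", "5xx"}

// classOf returns the status class a response code falls into, e.g. 200 -> "2xx".
func classOf(statusCode int) string {
	return fmt.Sprintf("%dxx", statusCode/100)
}

// recordStatus emits one data point per status class. When emitZeroValues is
// false, only the point for the matching class is recorded, so a healthy
// endpoint produces a single httpcheck.status data point.
func recordStatus(statusCode int, emitZeroValues bool) map[string]int64 {
	points := make(map[string]int64)
	match := classOf(statusCode)
	for _, class := range statusClasses {
		if class == match {
			points[class] = 1
		} else if emitZeroValues {
			points[class] = 0
		}
	}
	return points
}

func main() {
	fmt.Println(len(recordStatus(200, true)))  // 5 data points (current behavior)
	fmt.Println(len(recordStatus(200, false))) // 1 data point (proposed)
}
```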
Describe alternatives you've considered
I suppose an alternative would be to add a processor to the pipeline that filters out the zero data points. But honestly, I wasn't able to find a way to filter out metric data points based on their value using either the Filter processor or the Metrics Transform processor. Is this possible? 🤔
Additional context
Telemetry is costly; we don't want to collect metrics that carry little value.
Pinging code owners:
- receiver/httpcheck: @codeboten
See Adding Labels via Comments if you do not have permissions to add labels yourself.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
@codeboten can you please take a look? I believe this issue is important, as I couldn't find a way to exclude the zero time series using a processor:
I suppose an alternative would be to add a processor to the pipeline that will filter out the zero data points. But honestly I wasn't able to find a way to filter out metric data points based on their value using either the Filter processor or Metrics Transform processor. Is this possible? 🤔
@astencel-sumo will take a look this week
@astencel-sumo will take a look this week
@codeboten is this going to be the week? 😉
@astencel-sumo yes... sorry for the delay!
As discussed in the Dec-13 SIG call, the plan to move this forward is to allow a configuration option that filters out the `http.status_class` attribute, resulting in a single data point.
Stretch goal is to make this attribute filtering generic enough to be used in all scrapers 😬
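If that lands, the configuration might look something like this (a purely hypothetical shape to illustrate the idea; no such option exists today, and the key name `exclude_attributes` is made up):

```yaml
receivers:
  httpcheck:
    targets:
      - endpoint: https://opentelemetry.io
    # hypothetical: drop the http.status_class attribute at scrape time,
    # collapsing the five per-class points into a single data point
    exclude_attributes:
      - http.status_class
```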
@codeboten can you please take a look? I believe this issue is important, as I couldn't find a way to exclude the zero time series using a processor:
I suppose an alternative would be to add a processor to the pipeline that will filter out the zero data points. But honestly I wasn't able to find a way to filter out metric data points based on their value using either the Filter processor or Metrics Transform processor. Is this possible? 🤔
I found a way to exclude zero values with the Filter processor:
processors:
  filter:
    metrics:
      datapoint:
        - 'metric.name == "httpcheck.status" and value_int == 0'
However, this is not really what I want. I do want to get the zero values when the endpoint is down. I just want a single zero datapoint and not five. Let me rephrase the issue title to account for this.
I think this is a workaround that makes reasonable sense to me:
filter/drop-non-2xx-datapoints:
  metrics:
    datapoint:
      - 'metric.name == "httpcheck.status" and attributes["http.status_class"] != "2xx"'
Here's a full example:
exporters:
  debug:
    verbosity: detailed
  prometheus:
    endpoint: localhost:1234
processors:
  filter/drop-non-2xx-datapoints:
    metrics:
      datapoint:
        - 'metric.name == "httpcheck.status" and attributes["http.status_class"] != "2xx"'
  transform/drop-status-class-attribute:
    metric_statements:
      - context: datapoint
        statements:
          - keep_keys(attributes, ["http.url", "http.status_code", "http.method"]) where metric.name == "httpcheck.status"
receivers:
  httpcheck:
    collection_interval: 3s
    targets:
      - endpoint: https://opentelemetry.io
      - endpoint: https://non.existent.address
service:
  pipelines:
    metrics:
      exporters:
        - debug
        - prometheus
      processors:
        - filter/drop-non-2xx-datapoints
        - transform/drop-status-class-attribute
      receivers:
        - httpcheck
Here's the output from the collector:
$ otelcol-contrib-0.89.0-darwin_arm64 --config config.yaml
2023-12-14T10:06:38.819+0100 info [email protected]/telemetry.go:85 Setting up own telemetry...
2023-12-14T10:06:38.819+0100 info [email protected]/telemetry.go:202 Serving Prometheus metrics {"address": ":8888", "level": "Basic"}
2023-12-14T10:06:38.819+0100 info [email protected]/exporter.go:275 Development component. May change in the future. {"kind": "exporter", "data_type": "metrics", "name": "debug"}
2023-12-14T10:06:38.819+0100 info [email protected]/receiver.go:296 Development component. May change in the future. {"kind": "receiver", "name": "httpcheck", "data_type": "metrics"}
2023-12-14T10:06:38.819+0100 info [email protected]/service.go:143 Starting otelcol-contrib... {"Version": "0.89.0", "NumCPU": 10}
2023-12-14T10:06:38.819+0100 info extensions/extensions.go:34 Starting extensions...
2023-12-14T10:06:38.820+0100 info [email protected]/service.go:169 Everything is ready. Begin running and processing data.
2023-12-14T10:06:40.380+0100 info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 1, "metrics": 3, "data points": 5}
2023-12-14T10:06:40.381+0100 info ResourceMetrics #0
Resource SchemaURL:
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope otelcol/httpcheckreceiver 0.89.0
Metric #0
Descriptor:
-> Name: httpcheck.duration
-> Description: Measures the duration of the HTTP check.
-> Unit: ms
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> http.url: Str(https://non.existent.address)
StartTimestamp: 2023-12-14 09:06:38.819473 +0000 UTC
Timestamp: 2023-12-14 09:06:39.822583 +0000 UTC
Value: 5
NumberDataPoints #1
Data point attributes:
-> http.url: Str(https://opentelemetry.io)
StartTimestamp: 2023-12-14 09:06:38.819473 +0000 UTC
Timestamp: 2023-12-14 09:06:39.822618 +0000 UTC
Value: 557
Metric #1
Descriptor:
-> Name: httpcheck.error
-> Description: Records errors occurring during HTTP check.
-> Unit: {error}
-> DataType: Sum
-> IsMonotonic: false
-> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
-> http.url: Str(https://non.existent.address)
-> error.message: Str(Get "https://non.existent.address": dial tcp: lookup non.existent.address: no such host)
StartTimestamp: 2023-12-14 09:06:38.819473 +0000 UTC
Timestamp: 2023-12-14 09:06:39.822583 +0000 UTC
Value: 1
Metric #2
Descriptor:
-> Name: httpcheck.status
-> Description: 1 if the check resulted in status_code matching the status_class, otherwise 0.
-> Unit: 1
-> DataType: Sum
-> IsMonotonic: false
-> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
-> http.url: Str(https://non.existent.address)
-> http.status_code: Int(0)
-> http.method: Str(GET)
StartTimestamp: 2023-12-14 09:06:38.819473 +0000 UTC
Timestamp: 2023-12-14 09:06:39.822583 +0000 UTC
Value: 0
NumberDataPoints #1
Data point attributes:
-> http.url: Str(https://opentelemetry.io)
-> http.status_code: Int(200)
-> http.method: Str(GET)
StartTimestamp: 2023-12-14 09:06:38.819473 +0000 UTC
Timestamp: 2023-12-14 09:06:39.822618 +0000 UTC
Value: 1
{"kind": "exporter", "data_type": "metrics", "name": "debug"}
Here's the output from the Prometheus exporter:
$ curl localhost:1234/metrics
# HELP httpcheck_duration_milliseconds Measures the duration of the HTTP check.
# TYPE httpcheck_duration_milliseconds gauge
httpcheck_duration_milliseconds{http_url="https://non.existent.address"} 4
httpcheck_duration_milliseconds{http_url="https://opentelemetry.io"} 176
# HELP httpcheck_error Records errors occurring during HTTP check.
# TYPE httpcheck_error gauge
httpcheck_error{error_message="Get \"https://non.existent.address\": dial tcp: lookup non.existent.address: no such host",http_url="https://non.existent.address"} 1
# HELP httpcheck_status 1 if the check resulted in status_code matching the status_class, otherwise 0.
# TYPE httpcheck_status gauge
httpcheck_status{http_method="GET",http_status_code="0",http_url="https://non.existent.address"} 0
httpcheck_status{http_method="GET",http_status_code="200",http_url="https://opentelemetry.io"} 1
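For what it's worth, once each target exposes a single `httpcheck_status` series like this, alerting on a down endpoint becomes straightforward. A Prometheus alerting-rule sketch (rule name, duration, and labels are illustrative, not part of this setup):

```yaml
groups:
  - name: httpcheck
    rules:
      - alert: EndpointDown
        # 0 means the check did not get a 2xx response (or failed entirely)
        expr: httpcheck_status == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "HTTP check failed for {{ $labels.http_url }}"
```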
This issue has been closed as inactive because it has been stale for 120 days with no activity.