client_golang icon indicating copy to clipboard operation
client_golang copied to clipboard

testutil.GatherAndCompare is error-prone

Open machine424 opened this issue 2 years ago • 5 comments

After https://github.com/prometheus/client_golang/pull/1143, the function testutil.GatherAndCompare:

https://github.com/prometheus/client_golang/blob/1bae6c1e6314f6a20be183a7277059630780232a/prometheus/testutil/testutil.go#L197-L203

https://github.com/prometheus/client_golang/blob/1bae6c1e6314f6a20be183a7277059630780232a/prometheus/testutil/testutil.go#L205-L222

https://github.com/prometheus/client_golang/blob/1bae6c1e6314f6a20be183a7277059630780232a/prometheus/testutil/testutil.go#L236-L245

became error-prone, as we filter the expected metrics in compareMetricFamilies as well now, if we make a mistake in one of the arguments expected or metricNames, the function will no longer return an error.

If e.g. we make a mistake in the name of the metric (provide a metric that is not exported because of a typo or re-using a variable containing the wrong metric name), the function will return nil.

Or e.g. if we provide an expected with the wrong metrics to a faulty code, we could never be aware of that.

I think expected lost its safeguard/source of truth utility.

I expect the user/tester to have full control on expected, I don't see why one couldn't provide the exact series they expect.

Plus having "dynamic" expected makes the tests hard to grasp and debug. No one needs tests with too many logic/intelligence that needs to be tested as well :) Way beneficial when they are more explicit even if this result in a little bit more code.

Looking at the code that was justifying the change:

https://github.com/fluxninja/aperture/blob/416d9ceaff9c1b2c4c238210c7c067dcf0a65567/pkg/otelcollector/metricsprocessor/processor_test.go#L148-L168

it's still not making use of the change yet, but I think they can easily split their expected and pass it as a map or sth to their test function...

Discovered this while adding a test for kubernetes etcd metrics: https://github.com/kubernetes/kubernetes/pull/120827. I copied/pasted a test function and the test was passing because I forgot to change the metric name.

machine424 avatar Sep 26 '23 21:09 machine424

cc @dgrisonnet @wojtek-t

machine424 avatar Sep 26 '23 21:09 machine424

cc @kakkoyun @bwplotka

Could you please provide some insights here?

dgrisonnet avatar Sep 28 '23 12:09 dgrisonnet

Thanks for this finding. I think both you and https://github.com/prometheus/client_golang/pull/1143 was right. I think you would agree that before https://github.com/prometheus/client_golang/pull/1143 we had broken implementation as both signature and comments of GatherAndCompare, TransactionalGatherAndCompare and compareMetricFamilies suggest that we compare all metrics OR only filtered one by "metricNames" slice if provided.

What's broken now is that the filtered metric names should be PRESENT and expected.

Acceptance Criteria:

  • If metricNames > 0 AND one of those metric is NOT present in "got" (or expected optionally) error should be returned.
  • Clean up tests - all ScrapeAndCompare, GatherAndCompare etc are scattered and duplicated. Let's focus to test once each functionality and use table tests for different cases.

Help wanted!

bwplotka avatar Oct 10 '23 09:10 bwplotka

Fix for this was reverted: https://github.com/prometheus/client_golang/releases/tag/v1.20.5, so reopening. We plan to deliver long term fix in https://github.com/prometheus/client_golang/issues/1639

bwplotka avatar Oct 15 '24 10:10 bwplotka

Hello 👋 Looks like there was no activity on this issue for the last 3 months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity in the next 4 weeks, this issue will be closed (we can always reopen an issue if we need!).

stale[bot] avatar Jul 19 '25 06:07 stale[bot]