Callum Styan comments

Results: 206 comments of Callum Styan

Metrics: GC stale series separately from truncating WAL

> How strong is the "4 minutes is the maximum recommended scrape interval for any target" statement? IIRC the recommendation is even shorter `This interval should not exceed 2m to...

consider clarify "Out of order sample from remote write" error (remote-write-receiver feature enabled)

I honestly don't know how the Helm chart deploys Prometheus. If it's deploying an HA pair and you aren't adding external labels to differentiate between each of those Prometheus instances...
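
To illustrate why an HA pair without distinguishing external labels can trigger out-of-order errors on the receiving end, here is a minimal toy model in Python. It is not Prometheus code: the `Receiver` class, the `send` helper, and the `replica` label name are all hypothetical, and the ordering check is simplified to "reject any sample older than the last one stored for the same label set".

```python
from typing import Dict, Tuple

# A series is identified by its full, sorted label set.
SeriesKey = Tuple[Tuple[str, str], ...]


class Receiver:
    """Toy receiver (hypothetical): tracks the newest timestamp per series."""

    def __init__(self) -> None:
        self.last_ts: Dict[SeriesKey, int] = {}

    def append(self, labels: Dict[str, str], ts: int) -> bool:
        """Return True if the sample is accepted, False if it is out of order."""
        key = tuple(sorted(labels.items()))
        last = self.last_ts.get(key)
        if last is not None and ts < last:
            return False  # older than what is already stored for this series
        self.last_ts[key] = ts
        return True


def send(receiver: Receiver, external_labels: Dict[str, str], ts: int) -> bool:
    """Simulate one replica forwarding the same scraped series.

    External labels are merged into the label set before sending, so they
    become part of the series identity at the receiver.
    """
    labels = {"__name__": "up", "job": "node", **external_labels}
    return receiver.append(labels, ts)


# Without external labels both replicas write to the SAME series.
shared = Receiver()
print(send(shared, {}, 1000))  # replica A arrives first
print(send(shared, {}, 990))   # replica B's slightly older sample is rejected

# With a distinguishing label (name assumed here) each replica has its own series.
separate = Receiver()
print(send(separate, {"replica": "a"}, 1000))
print(send(separate, {"replica": "b"}, 990))
```

In this sketch the second sample is rejected only when both replicas share one label set; once each replica carries its own label value, the receiver sees two independent series and accepts both.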

consider clarify "Out of order sample from remote write" error (remote-write-receiver feature enabled)

The error message's details are all from the receiving end, so we can make an improvement here to the error that Prometheus returns when it scrapes/receives remote write but other...

consider clarify "Out of order sample from remote write" error (remote-write-receiver feature enabled)

@utkuozdemir is this intermittent? On startup? Constant? The occasional `out of order` message is not out of the ordinary or necessarily a problem.

consider clarify "Out of order sample from remote write" error (remote-write-receiver feature enabled)

> I'm having this same issue with Federated prometheus running in GKE clusters. The main operator prometheus has restarted nearly 60 times since the middle of the night last night....

consider clarify "Out of order sample from remote write" error (remote-write-receiver feature enabled)

@prologic the right path forward is what we refer to as "remote write checkpointing", the ability for remote write to keep track of where within the WAL it has successfully...

Append metadata to the WAL in the scrape loop

As others have mentioned, it would be a good idea to put this behind a feature flag initially. Have a look at `cmd/prometheus/main.go`, specifically the function `setFeatureListOptions` for how we...
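
As a rough illustration of the feature-flag idea, the sketch below gates a new code path behind a comma-separated list of enabled feature names. It is written in Python for brevity and does not reproduce `setFeatureListOptions` or any Prometheus internals. The `parse_feature_list` and `append_scrape` functions and the `metadata-in-wal` feature name are hypothetical, and the list-based `wal` is only a stand-in for the real WAL.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical feature name, chosen only for this example.
METADATA_FEATURE = "metadata-in-wal"


def parse_feature_list(values: Iterable[str]) -> Set[str]:
    """Collect feature names from one or more comma-separated flag values."""
    enabled: Set[str] = set()
    for value in values:
        for name in value.split(","):
            name = name.strip()
            if name:  # ignore empty entries such as a trailing comma
                enabled.add(name)
    return enabled


def append_scrape(
    wal: List[Tuple[str, object]],
    sample: Tuple[str, float],
    metadata: Dict[str, str],
    features: Set[str],
) -> None:
    """Always append the sample; append metadata only when the feature is on."""
    wal.append(("sample", sample))
    if METADATA_FEATURE in features:
        wal.append(("metadata", metadata))


features = parse_feature_list(["remote-write-receiver, metadata-in-wal"])
wal: List[Tuple[str, object]] = []
append_scrape(wal, ("up", 1.0), {"type": "gauge"}, features)
print(wal)
```

Keeping the check at the single point where the new record would be written means the default behavior is unchanged when the feature is not enabled.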

Memory usage spikes during WAL replay to more than normal usage

Your Prometheus instance either needs more RAM, or you can delete the WAL directory (you will lose some data). I don't think there have been changes recently that would affect replay memory...

Memory usage spikes during WAL replay to more than normal usage

Hi everyone. I appreciate that it is probably frustrating to be running into this problem and see an open issue for it that hasn't been fully fixed, but multiple +1...

add color logging capabilities for promlog

cc: @fabxc