Callum Styan
                                    > How strong is the "4 minutes is the maximum recommended scrape interval for any target" statement? IIRC the recommendation is even shorter `This interval should not exceed 2m to...
I honestly don't know how the Helm chart deploys Prometheus. If it's deploying an HA pair and you aren't adding external labels to differentiate between each of those Prometheus instances...
The error message's details are all from the receiving end, so we can make an improvement here to the error that Prometheus returns when it scrapes/receives remote write but other...
@utkuozdemir is this intermittent, only on startup, or constant? The occasional `out of order` message is not unusual and not necessarily a problem.
> I'm having this same issue with Federated prometheus running in GKE clusters. The main operator prometheus has restarted nearly 60 times since the middle of the night last night....
@prologic the right path forward is what we refer to as "remote write checkpointing", the ability for remote write to keep track of where within the WAL it has successfully...
As others have mentioned, it would be a good idea to put this behind a feature flag initially. Have a look at `cmd/prometheus/main.go`, specifically the function `setFeatureListOptions` for how we...
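To illustrate the feature-flag suggestion in the comment above, here is a minimal sketch of how a comma-separated list of feature names could be turned into boolean toggles. This is not the code in `cmd/prometheus/main.go`: the `featureFlags` type, the `applyFeatureList` function, and the `new-behaviour` feature name are all hypothetical and exist only for illustration. It assumes the feature names arrive as strings, as they would from a repeatable command-line flag.

```go
package main

import (
	"fmt"
	"strings"
)

// featureFlags is a hypothetical container for experimental toggles.
// It is NOT the struct Prometheus uses.
type featureFlags struct {
	enableNewBehaviour bool // hypothetical feature, off by default
}

// applyFeatureList walks every supplied value, splits it on commas,
// and switches on each recognised feature name. Names it does not
// recognise are returned so the caller can log a warning.
func applyFeatureList(features []string) (featureFlags, []string) {
	var flags featureFlags
	var unknown []string
	for _, f := range features {
		for _, name := range strings.Split(f, ",") {
			name = strings.TrimSpace(name)
			switch name {
			case "":
				// Ignore empty entries such as a trailing comma.
				continue
			case "new-behaviour": // hypothetical feature name
				flags.enableNewBehaviour = true
			default:
				unknown = append(unknown, name)
			}
		}
	}
	return flags, unknown
}

func main() {
	flags, unknown := applyFeatureList([]string{"new-behaviour, something-else"})
	fmt.Println("new behaviour enabled:", flags.enableNewBehaviour)
	fmt.Println("unrecognised features:", unknown)
}
```

Keeping the new code path behind a toggle like this means the default behaviour is unchanged unless the operator opts in, which is the point of gating a change initially.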
Your Prometheus instance either needs more RAM, or you can delete the WAL directory (you will lose some data). I don't think there have been any recent changes that would affect replay memory...
Hi everyone. I appreciate that it is probably frustrating to be running into this problem and see an open issue for it that hasn't been fully fixed, but multiple +1...
cc: @fabxc