alerting: provide more examples for alert rules

Open kayrus opened this issue 8 years ago • 20 comments

E.g. the most common ones (critical and warning severity):

  • disk usage
  • memory usage
  • CPU usage
  • kubelet limit for pods per node
  • etc.

kayrus avatar Oct 06 '16 08:10 kayrus

There is also http://tips.robustperception.io/ , where concrete examples are supposed to live.

beorn7 avatar Oct 06 '16 10:10 beorn7

@beorn7 try to google for "prometheus disk alert example" and you'll find nothing about this page.

Also, when I tried to search for the "alert" keyword (http://tips.robustperception.io/search:site/q/alert), I got "An error occurred when processing your request."

kayrus avatar Oct 06 '16 10:10 kayrus

I'm not saying the solution is already on that site, or that the site is already in perfect operational condition. It's just the thing that @brian-brazil recently set up for Prometheus tips. When addressing the issue you have filed, we should think about how many examples we want to add to the core docs (which are supposed to be very concise) and which might be better suited to the tips site.

beorn7 avatar Oct 06 '16 13:10 beorn7

To be clear, I want the reference docs to stay as reference docs. I'm all for different types of docs being on prometheus.io if we can find a sane way to approach it.

brian-brazil avatar Oct 06 '16 13:10 brian-brazil

@beorn7 Don't think about which ones, just shoot. Users need examples, and complete ones; try to avoid examples taken out of context.

kayrus avatar Oct 06 '16 14:10 kayrus

Coming from Nagios, this is what I find lacking. I'm totally sold on most parts of Prometheus, but I'm missing not only alert examples but also sane defaults (could be opt-in) for alerts. This would help lower the bar for people who want to migrate from other solutions.

varac avatar Mar 01 '17 21:03 varac

+1

arj22 avatar Jun 02 '17 17:06 arj22

@juliusv has started an initiative to do this and started with metrics from the node_exporter.

brancz avatar Jun 03 '17 07:06 brancz

Can anyone give some alert examples or a specific link? I am kind of stuck at setting up alerts.

imickeyj avatar Sep 21 '17 11:09 imickeyj

+1

myfreax avatar Sep 23 '17 07:09 myfreax

+1

4220182 avatar Sep 27 '17 13:09 4220182

@imickeyj For now, you can find plenty of alerting rule examples in GitLab's alerting configs: https://gitlab.com/gitlab-com/runbooks/tree/master/alerts

juliusv avatar Sep 27 '17 13:09 juliusv

At SAP we have been using Prometheus for quite a while now. Some inspiration for alerting rules can be found here.

auhlig avatar Sep 27 '17 15:09 auhlig

@juliusv @auhlig Great sharing!

I still hope the team will add more alerting rules to the sources and documentation. Thanks!

tangyong avatar Feb 07 '18 01:02 tangyong

We agree with all of this and acknowledge that there is a need for this. The problem today is that there is no, and likely never will be, standardized labeling of things, and that's probably a good thing. @tomwilkie and I have been talking extensively about some solutions to this, and we hope that we will soon have a proposal out for actually shareable alerting rules and dashboard definitions.

Whether this should be on prometheus.io is questionable, in my opinion. I think the respective "bundle" should be in the repository of the application that it describes, similar to how the etcd project has alerting rules and dashboard definitions in the op-guide. As I mentioned, how reusable these actually are today, given the many assumptions on labeling, is questionable, but we are trying to solve that problem.

brancz avatar Feb 07 '18 08:02 brancz

@brian-brazil Are you planning to integrate alertmanager with hipchat? It works well with slack, but we do not get a nicely readable format for hipchat alerts.

gauravgoyal0086 avatar Mar 15 '18 15:03 gauravgoyal0086

@gauravgoyal0086 Please do not ask support questions on unrelated issues.

brian-brazil avatar Mar 15 '18 15:03 brian-brazil

I am sorry. I thought I could ask it here since this thread is related to alerts. Should I open a new issue for alertmanager + hipchat?

gauravgoyal0086 avatar Mar 15 '18 15:03 gauravgoyal0086

Please use the prometheus-users mailing list.

brian-brazil avatar Mar 15 '18 15:03 brian-brazil

http://tips.robustperception.io is down

piotrkochan avatar Oct 18 '18 09:10 piotrkochan