statusbot
Monitor your uptime automagically with statuspage.io
StatusBot
I check URLs and report their status to statuspage.io.
This powers status.taskrabbit.com.
This tool will gather response times for your METRICS.
METRICS are linked to a COMPONENT, and a change in the metric triggers a change in the COMPONENT status. For example, if you have the METRIC "api_user_search_response_time" and the COMPONENT "API", you can set up StatusBot to mark "API" as having an outage when the check for the endpoint behind "api_user_search_response_time" is slow or does not return a 200 OK.
- If a metric check is slow (check.threshold), the related component is noted as degraded performance
- If there is an HTTP error checking the metric, the related component is noted as having a partial_outage
- You configure how many times a metric must fail before the outage is triggered (default 10).
- If you don't want to create an incident for a metric, set check.impact = 'none'
- StatusBot will never close an incident for you. This should be done manually via your statuspage.io dashboard, so you can add incident notes.
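The rules above can be sketched roughly as follows. This is only an illustration and not StatusBot's actual code: the function names, the result fields (statusCode, durationMs, error), the use of milliseconds for check.threshold, and the reading of "fail" as consecutive failures are all assumptions. The status strings are statuspage.io component status names.

```javascript
// Illustrative sketch only: function names, field names, and units are
// assumptions, not StatusBot's actual implementation.

// Map one check result to a statuspage.io component status.
function classifyResult(check, result) {
  // An HTTP error, or any response other than 200 OK, is a partial outage.
  if (result.error || result.statusCode !== 200) {
    return 'partial_outage';
  }
  // A 200 OK slower than check.threshold (assumed milliseconds) is degraded.
  if (result.durationMs > check.threshold) {
    return 'degraded_performance';
  }
  return 'operational';
}

// Track failures per metric and decide when to report.
// Assumes "fail N times" means N consecutive failing checks.
function createFailureTracker(maxFailures = 10) {
  const failures = new Map(); // metric name -> consecutive failure count

  return function record(check, status) {
    if (status === 'operational') {
      failures.set(check.metric, 0);
      return null;
    }
    const count = (failures.get(check.metric) || 0) + 1;
    failures.set(check.metric, count);
    if (count < maxFailures) {
      return null; // not enough failures yet
    }
    return {
      component: check.component,
      status,
      // check.impact = 'none' suppresses incident creation.
      createIncident: check.impact !== 'none',
    };
  };
}
```

In this sketch, a caller would pass each result through classifyResult and hand the status to the tracker. The tracker returns null until the failure limit is reached, then returns the component update to report.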
Install
- npm install
- configure your checks in /config/statuspage.js (copy from /config/statuspage.example.js)
- npm start
StatusBot is an actionhero.js project. Visit www.actionherojs.com for more information.
Notes
- As this is an actionhero project, you can do the following:
  - Manually run the checks with the action check
  - You can run this action via HTTP, Telnet, and sockets if you configure the server to enable these transports.
  - Deploy a cluster of statusBots that all share the same redis instance and all the checks will be distributed between them.
- You can configure the frequency at which all URLs are checked within the checkAll task by modifying task.frequency. The default is 10 seconds.
  - To account for statuspage.io's rate limiting, we will wait 5 seconds before each check within the checkAll task. This is not configurable.
- You can configure the default message that new incidents are created with in /config/errors.js
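The wait before each check in the checkAll task can be pictured with a sketch like the one below. It is a guess at the general shape, not the task's real code: runCheck is a hypothetical caller-supplied function, and the delay is a parameter here only so the sketch can be exercised quickly (in StatusBot the 5 second wait is fixed).

```javascript
// Illustrative sketch only: checkAll's real implementation may differ.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run every check in order, pausing before each one.
// runCheck is a hypothetical async function supplied by the caller.
async function checkAll(checks, runCheck, delayMs = 5000) {
  const results = [];
  for (const check of checks) {
    await sleep(delayMs); // wait before each check to respect rate limits
    results.push(await runCheck(check));
  }
  return results;
}
```

Because the checks run one after another, a full pass in this sketch takes at least the number of checks multiplied by the delay.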
TODO
- configure the workers so they fairly round-robin the checks between all available nodes.
- allow for regexp-based body checks
- these regexp checks might power the incident messaging