uptime-kuma
Add tags as labels to prometheus metrics
Description
Fixes #680 - Once working, this should add any tags added in the UI as labels for Prometheus
Type of change
Please delete options that are not relevant.
- New feature (non-breaking change which adds functionality)
Checklist
- [x] My code follows the style guidelines of this project
- [x] I ran ESLint and other linters for modified files
- [x] I have performed a self-review of my own code and tested it
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] My changes generate no new warnings
- [ ] My code needed automated testing. I have added them (this is optional task)
Screenshots (if any)
Please do not use any external image service. Instead, just paste in or drag and drop the image here and it will be uploaded automatically.
@chakflying / @louislam - I'm not sure how to do this in NodeJS, hopefully this is a reasonable start to show the kind of thing I'm trying to achieve?
Instead of
monitor_status{monitor_name="news.bbc.co.uk",monitor_type="http",monitor_url="https://news.bbc.co.uk",monitor_hostname="null",monitor_port="null"} 1
then I should see
monitor_status{monitor_name="news.bbc.co.uk",monitor_type="http",monitor_url="https://news.bbc.co.uk",monitor_hostname="null",monitor_port="null", my_tag_1="my_value_1", my_tag_2="my_value_2"} 1
Does that make sense?
If that's the data structure you want, you would probably do this:
for (const tag of monitor.tags) {
    this.monitorLabelValues[tag.name] = tag.value;
}
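A minimal sketch of how that loop could sit in context, assuming each tag is an object with `name` and `value` fields. The `buildLabels` helper and the sample data are hypothetical and not part of Uptime Kuma.

```javascript
// Hypothetical helper: copy the base labels, then add one label per tag.
function buildLabels(baseLabels, tags) {
    const labels = { ...baseLabels };
    // Treat a missing tag list as empty so monitors without tags still work.
    for (const tag of tags ?? []) {
        labels[tag.name] = tag.value;
    }
    return labels;
}

const labels = buildLabels(
    { monitor_name: "news.bbc.co.uk", monitor_type: "http" },
    [
        { name: "my_tag_1", value: "my_value_1" },
        { name: "my_tag_2", value: "my_value_2" },
    ]
);
console.log(labels);
```

Copying the base labels rather than mutating them keeps one monitor's tags from leaking into another's label set.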
Looks like the tags aren't part of the monitor object?
console.log(monitor);
returns the following
Monitor {
beanMeta: BeanMeta {
noCache: false,
fetchAs: '',
alias: '',
via: '',
withCondition: '',
withConditionData: []
},
_id: 5,
_name: 'news.bbc.co.uk',
_active: 1,
_userId: 1,
_interval: 60,
_url: 'https://news.bbc.co.uk',
_type: 'http',
_weight: 2000,
_hostname: null,
_port: null,
_createdDate: '2021-11-09 15:40:17',
_keyword: null,
_maxretries: 0,
_ignoreTls: 0,
_upsideDown: 0,
_maxredirects: 10,
_acceptedStatuscodesJson: '["200-299"]',
_dnsResolveType: 'A',
_dnsResolveServer: '1.1.1.1',
_dnsLastResult: null,
_retryInterval: 60,
_pushToken: null,
_method: 'GET',
_body: null,
_headers: null
}
however I'd expect it to include two tags based on the data in the UI:
Please use ESLint!
You are using var in your code...
Yup, I learned JavaScript nearly 20 years ago and have spent most of my life writing other languages. I had no idea that var had been replaced. I've added eslint to my codebase and fixed the syntax issues, however it doesn't fix the actual issue that the tags in monitor.tags don't show up as Prometheus labels.
Any ideas how I get this working?
Currently we use the ORM in a very raw way, meaning that table joins have to be done manually. You can check monitor.js:52 for how this is done. But it does seem a bit weird that we need another DB query every time we want to use tags. Maybe @louislam will know if the ORM can do this automatically.
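As a rough sketch of what a manual join for tags might look like: the table and column names below (`monitor_tag`, `tag`) are assumptions inferred from the fields seen later in this thread, and the query function is injected so the example runs without a database. `getTagsForMonitor` and `fakeGetAll` are hypothetical names.

```javascript
// Assumed schema: monitor_tag(id, monitor_id, tag_id, value) and tag(id, name, color).
const TAGS_FOR_MONITOR_SQL = `
    SELECT mt.*, tag.name, tag.color
    FROM monitor_tag mt
    JOIN tag ON mt.tag_id = tag.id
    WHERE mt.monitor_id = ?
`;

// `getAll(sql, params)` is whatever raw-query function the ORM provides (assumption).
async function getTagsForMonitor(getAll, monitorID) {
    return await getAll(TAGS_FOR_MONITOR_SQL, [ monitorID ]);
}

// Stand-in for a real query function; it returns one fixed row.
const fakeGetAll = async (sql, params) => [
    { id: 1, monitor_id: params[0], tag_id: 1, value: "garage", name: "location", color: "#4B5563" },
];

const tagsPromise = getTagsForMonitor(fakeGetAll, 1);
tagsPromise.then((rows) => console.log(rows));
```

Passing the monitor ID as a bound parameter rather than concatenating it into the SQL string avoids injection problems.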
@louislam - I'm hoping to find time to come back to this shortly, is the ORM capable of doing the links or should I just go for a raw query?
@chakflying - I've just found time to come back to this.
I've added some code that does a lookup and pulls back the tags, however I can't seem to get them to be passed through to the correct dict.
In the logs, I get the following:
Getting Tags for Prometheus
TAGS: [object Promise]
Found the following tags for 1 :
[
{
id: 1,
monitor_id: 1,
tag_id: 1,
value: 'garage',
name: 'location',
color: '#4B5563'
},
{
id: 2,
monitor_id: 1,
tag_id: 2,
value: 'se-1',
name: 'region',
color: '#DC2626'
}
]
but I'd expect to see all of the tags listed on the line that starts TAGS, rather than a promise?
I've tried to add an await to the call to get_tags(monitor), but this results in the following error:
/home/mmw/Projects/uptime-kuma/server/prometheus.js:58
let tags = await this.get_tags(monitor);
^^^^^
SyntaxError: await is only valid in async functions and the top level bodies of modules
at Object.compileFunction (node:vm:352:18)
at wrapSafe (node:internal/modules/cjs/loader:1026:15)
at Module._compile (node:internal/modules/cjs/loader:1061:27)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1151:10)
at Module.load (node:internal/modules/cjs/loader:975:32)
at Function.Module._load (node:internal/modules/cjs/loader:822:12)
at Module.require (node:internal/modules/cjs/loader:999:19)
at require (node:internal/modules/cjs/helpers:102:18)
at Object.<anonymous> (/home/mmw/Projects/uptime-kuma/server/model/monitor.js:8:24)
at Module._compile (node:internal/modules/cjs/loader:1097:14)
Any idea how I can get around this?
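Both symptoms have the same root: an async function always returns a Promise, and `await` is only allowed inside a function declared `async` (or at the top level of an ES module). A constructor cannot be async, so one common workaround is a static async factory. The sketch below uses hypothetical names (`getTags`, `PrometheusSketch`) and a stubbed lookup. It is not the project's actual code.

```javascript
// Stand-in for the database lookup; being async, it returns a Promise of rows.
async function getTags(monitor) {
    return [ { name: "location", value: "garage" } ];
}

class PrometheusSketch {
    constructor(monitor, tags) {
        this.monitor = monitor;
        this.tags = tags;
    }

    // A constructor cannot be async, so the awaiting happens in a static factory.
    static async create(monitor) {
        const tags = await getTags(monitor);
        return new PrometheusSketch(monitor, tags);
    }
}

// Without await, the caller holds the Promise itself rather than the rows.
const pending = getTags({ id: 1 });
console.log("TAGS: " + pending); // prints "TAGS: [object Promise]"

// With the factory, the tags are resolved before the object is built.
const ready = PrometheusSketch.create({ id: 1 });
ready.then((p) => console.log("TAGS:", p.tags));
```

The alternative is to chain `.then()` on the returned Promise, but every caller up the chain still has to cope with the result arriving later.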
OK, the latest code now takes specific tags and applies them as labels to Prometheus metrics.
This isn't ideal, because I'd really like to have the labels added dynamically, but it's a constraint from the upstream Prom-client package, so it will have to do for now.
Tags that will automatically be turned into labels are:
"location",
"region",
"datacenter",
"cloud_provider",
"az",
"rack",
"shelf",
"room",
"floor"
and new ones can be added by updating https://github.com/louislam/uptime-kuma/pull/898/files#diff-a2ea08464c146b6af2888bdf744cb4789cac4bcda6731289d02431bcc9451363R10-R18
Once this is merged, I'll update the wiki accordingly.
Tags that do not match one of the above names are silently ignored.
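The filtering described above could be sketched roughly as follows. The list mirrors the names in this comment; `ALLOWED_TAG_LABELS` and `tagsToLabels` are hypothetical names for illustration and may not match the PR's code.

```javascript
// Tag names that may become labels, as listed in this comment.
const ALLOWED_TAG_LABELS = [
    "location", "region", "datacenter", "cloud_provider",
    "az", "rack", "shelf", "room", "floor",
];

// Keep only tags whose name is on the list; everything else is skipped silently.
function tagsToLabels(tags) {
    const labels = {};
    for (const tag of tags) {
        if (ALLOWED_TAG_LABELS.includes(tag.name)) {
            labels[tag.name] = tag.value;
        }
    }
    return labels;
}

const tagLabels = tagsToLabels([
    { name: "location", value: "garage" },
    { name: "region", value: "se-1" },
    { name: "owner", value: "ops" }, // not on the list, so it is dropped
]);
console.log(tagLabels);
```

A fixed list keeps the set of label names known up front, which is the constraint mentioned above.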
@chakflying / @louislam - any chance you can take a look at this please?
Looking for testers who use Prometheus to test this pull request.
@louislam I should be able to take a look this evening or some time tomorrow and feedback with the results.
Intermediary test results
Here are some intermediary results from my testing so far. Please note that I haven't quite finished testing the PR yet so I will provide some more updates probably tomorrow.
Test environment
Server
- OS: Debian 11
- Kernel: Linux 5.10.0-11-amd64
- Platform: VMware
- Architecture: x86-64
- node -v: v17.4.0
- pm2 -v: 5.1.2
- Previously ran Uptime Kuma 1.13.1 in production using pm2
Client
- OS: Kali GNU/Linux Rolling 2021.1
- Kernel: Linux 5.10.0-kali7-amd64
- Platform: Acer laptop
- Architecture: x86-64
- Browser: 99.0.4844.82 (Official Build) (64-bit)
- JavaScript: V8 9.9.115.9
- User agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.82 Safari/537.36
Results
Hmm, I seem to be getting some slightly odd results. There are a couple of things that seem a touch unusual. When I add a tag to a monitor, the prometheus metric gets repeated, one with the tag included and one without.
monitor_cert_days_remaining{monitor_name="google.com",monitor_type="http",monitor_url="https://google.com",monitor_hostname="null",monitor_port="null"} 61
monitor_cert_days_remaining{monitor_name="google.com",monitor_type="http",monitor_url="https://google.com",monitor_hostname="null",monitor_port="null",location="google"} 61
Another thing I noticed, when I create a new monitor with a tag, it doesn't appear in the prometheus metrics and only appears if I edit the monitor and resave it.
Steps to reproduce
- Create a new monitor and assign a tag from the list in a previous comment
- Save the monitor
- Visit the /metrics page and note that the new monitor does not contain the tag
- Edit the monitor and resave it
- Visit /metrics again and note that the previous metric without the tag is still present but a new one with the tag has also been added.
Note that these have all been tested with a database that previously came from 1.13.1 before issuing the following commands:
gh pr checkout 898
npm install
npm run build
pm2 restart
I will try again with a clean database and clean node_modules directory.
This does seem to persist after I deleted the database and dist folders as well as node_modules, stopped the pm2 service, and then ran the following.
npm install
npm run build
npm start
When the service is restarted, though, the duplication disappears. I presume that this is not so much an issue with the PR as with the way the Prometheus metrics page is constructed. Perhaps a slightly bigger issue is that when the value of a tag is modified, we end up with one metric carrying the old tag value and a repeat carrying the new one, until the server is restarted. This repeats for every change, so if, say, 8 changes are made, there will be 8 copies of the metric in the response. Interestingly, only the most recent metric is updated; all of the previous ones remain at the value they had when the settings were changed.
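The behaviour described here is consistent with how labelled metrics generally work: each distinct set of label values is its own series, so changing a tag value creates a new series and leaves the old one behind until something removes it. The sketch below illustrates the idea with a tiny in-memory stand-in. `SketchGauge` is hypothetical and is NOT the prom-client API; whether prom-client's own removal facilities fit this case would need checking against its documentation.

```javascript
// Tiny stand-in for a labelled gauge; NOT the prom-client API.
class SketchGauge {
    constructor() {
        this.series = new Map(); // one entry per distinct label set
    }
    key(labels) {
        // Sort the entries so label order does not affect the key.
        return JSON.stringify(Object.entries(labels).sort());
    }
    set(labels, value) {
        this.series.set(this.key(labels), value);
    }
    remove(labels) {
        this.series.delete(this.key(labels));
    }
}

const gauge = new SketchGauge();
const oldLabels = { monitor_name: "google.com", location: "google" };
const newLabels = { monitor_name: "google.com", location: "garage" };

gauge.set(oldLabels, 61);
gauge.set(newLabels, 61); // tag edited: a second series now exists
const beforeCleanup = gauge.series.size;

gauge.remove(oldLabels); // drop the series that carries the old label values
const afterCleanup = gauge.series.size;
console.log(beforeCleanup, afterCleanup);
```

In this model, the fix for stale duplicates is to remember the previous label set for each monitor and remove it whenever the tags change.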
Should probably rebase on master first, #1136 should have fixed the duplicate metrics issue, but I'm guessing it still doesn't handle editing tags correctly.
@chakflying I tried it again after rebasing but as you suspected, it doesn't seem to handle the changes in tags and the issue with the tag only appearing after second save is still present.
I checked the source code. addMonitorTag is a separate call, which I guess is called after addMonitor. That means that at the moment the monitor is saved and restarted internally, those tags haven't been added yet.
https://github.com/louislam/uptime-kuma/blob/4df147786da8835d4260dde115937935bd643df1/server/server.js#L928-L930
Wow, thanks, I'd completely missed the updates on this for some reason, great to see it making progress!
Apologies, life got in the way, this has now been rebased on master, hopefully it can be merged soon?
Please fix ESLint warnings
@Saibamen - this is now done. If we're going to enforce the fixing of warnings as well, surely we should update the CI so it fails on that too?
I'd not bothered to check the logs because the tests were green, so I missed that there were lint warnings too. :(
@Computroniks - I'd appreciate if you could test this again for me, especially with the issue you're seeing around tags etc. as I'm not quite sure I've followed that part of the conversation.
Sure, I will take another look tonight
I think this issue is not fixed yet?
Recap: The tag(s) is/are not active for the first save.
Oh, ok, thanks, this is the bit I've not quite followed.
Is there a "quick" solution to this I can implement? Should this logic be added somehow to addMonitor instead?
We are clearing up our old Pull Requests and yours has been open for 3 months with no activity. Remove stale label or comment or this will be closed in 2 days.
I could really do with some help here from either @Computroniks or @louislam as I'm well out of my depth!
I don't exactly remember the flow of the Prometheus endpoint.
I think you can try to recreate the Prometheus object in the addMonitorTag event:
- [/server/model/monitor.js] add a function, maybe called monitor.refreshPrometheus()
- [/server/server.js] inside addMonitorTag, get the monitor (server.monitorList[monitorID]) and call monitor.refreshPrometheus()
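The two steps above might be sketched like this. Everything here is a hypothetical stand-in (`MonitorSketch`, `tagStore`, the simplified `addMonitor` and `addMonitorTag`), intended only to show the ordering problem and how a refresh after the tag is saved would address it. It does not reflect the real server code.

```javascript
// Hypothetical stand-in for the monitor model.
class MonitorSketch {
    constructor(id, loadTags) {
        this.id = id;
        this.loadTags = loadTags; // async function returning tag rows
        this.labelValues = {};
    }

    // Re-read the tags and rebuild the label values used for metrics.
    async refreshPrometheus() {
        const tags = await this.loadTags(this.id);
        this.labelValues = Object.fromEntries(tags.map((t) => [ t.name, t.value ]));
    }
}

const tagStore = new Map(); // monitorID -> tag rows; stands in for the database
const monitorList = {};

async function addMonitor(id) {
    monitorList[id] = new MonitorSketch(id, async (mid) => tagStore.get(mid) ?? []);
    await monitorList[id].refreshPrometheus(); // no tags have been saved yet
}

async function addMonitorTag(monitorID, name, value) {
    tagStore.set(monitorID, [ ...(tagStore.get(monitorID) ?? []), { name, value } ]);
    await monitorList[monitorID].refreshPrometheus(); // pick up the new tag
}

const done = (async () => {
    await addMonitor(1);
    const before = { ...monitorList[1].labelValues };
    await addMonitorTag(1, "location", "garage");
    return { before, after: monitorList[1].labelValues };
})();
done.then((result) => console.log(result));
```

Refreshing inside `addMonitorTag` means a tag saved after the monitor was created still reaches the metrics without a second manual save.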
Thanks, I'll find some time to give that a go
@proffalken When you get time to continue, I would love to see some specific labels/tags that are often used:
- identifier, hostname or nodename: Often used to join to other metrics in dashboards. Maybe identifier would be sufficient. At least that would be my preference 😉 because it is open as to what exactly it represents, and everyone could relabel it in Prometheus to the specific name they need.
- depends_on: can be used by Alertmanager to make rules for dependencies of services or devices like gateways.