sql_exporter
sql_exporter copied to clipboard
Azure Active Directory authentication (AAD)
Is your feature request related to a problem? Please describe. I would like to be able to authenticate with Azure SQL via AAD
Describe the solution you'd like
https://github.com/denisenkom/go-mssqldb#azure-active-directory-authentication
Hi @reubano, thanks for the suggestion. We would need to provide a pull request to the upstream to include azuresql
schema to xo/dburl
to recognise the DSN. This seems to be a small change, so the draft PR is created already.
After it's merged, you will need add/replace the package by the subpackage azuread
as in the example in your link and rebuild the package.
It'd be great to test it before adding this subpackage by default.
I'll keep you posted.
Kind regards, Sergei
UPD: pointed at the wrong line initially.
Reference: https://github.com/xo/dburl/pull/21
Hi @reubano, the aforementioned PR is merged. I've also updated the version of dburl on our side. So, to test whether it works, you need to do the following:
- clone the current master
- update the package as per link above (simply add
/azuread
) - run
make build
to produce the binary - try it out with your configuration file
Please let me know if you need any assistance. Unfortunately, I don't have access to any Azure infrastructure now to try it out myself.
Thanks! I'm waiting to get access to a component in my app before I can test this. I'll keep you posted!
Getting errors. Not sure of the root cause, but I filed an issue at https://github.com/xo/usql/issues/380.
Hey @reubano, it seems like the issue was fixed at the end of the year (just checked). sql_exporter v0.9.2
is also using the latest dburl v0.13
dependency. Please have a look. 👍
@burningalchemist, it looks like this issue is resolved by the recent code change you just made (and described in #187), right? (I wanted to note this if that was actually the case here. Feel free to delete my comment if this is inaccurate.)
@jdstone Yes, the issue is resolved. However I remember @reubano had an issue with a system-assigned identity. Currently I assume it's either on the Azure side configuration or the driver itself.
@reubano we're happy to help, I guess. :) 👍
I think I got it. Took a while to obtain the proper Azure authorization. I'm using v0.11.1 now with the azuresql
protocol. Restarted the systemd
service and no errors so far.
Glad to hear the issue is resolved! :)
Guess I spoke too soon. curl http://localhost:9399/metrics
outputs the following:
No metrics gathered, [from Gatherer #1] unable to open tcp connection with host 'localhost:1433': dial tcp 127.0.0.1:1433: connect: connection refused
I'm not sure why its trying to connect on localhost.
systemctl status sql-exporter
● sql-exporter.service - SQL Exporter
Loaded: loaded (/etc/systemd/system/sql-exporter.service; indirect; vendor preset: enabled)
Active: active (running) since Tue 2023-08-15 19:46:27 UTC; 25s ago
Main PID: 21235 (sql_exporter)
Tasks: 4 (limit: 4069)
CGroup: /system.slice/sql-exporter.service
└─21235 /usr/bin/sql_exporter -log.level debug -config.file sql_exporter.yml
Aug 15 19:48:15 xxx sql_exporter[21582]: ts=2023-08-15T19:48:15.680Z caller=klog.go:84 level=debug func=Infof msg="Loaded collector 'worklist_counts' from /home/azureuser/prometheus/files/worklist_counts.collector.yml"
Aug 15 19:48:15 xxx sql_exporter[21582]: ts=2023-08-15T19:48:15.682Z caller=klog.go:84 level=debug func=Infof msg="Loaded collector 'worklist_metrics' from /home/azureuser/prometheus/files/worklist_metrics.collector.yml"
Aug 15 19:48:15 xxx sql_exporter[21582]: ts=2023-08-15T19:48:15.682Z caller=klog.go:55 level=debug func=Verbose.Infof msg="[collector=\"worklist_counts\"] Non-zero min_interval (10m), using cached collector."
Aug 15 19:48:15 xxx sql_exporter[21582]: ts=2023-08-15T19:48:15.685Z caller=klog.go:55 level=debug func=Verbose.Infof msg="[collector=\"worklist_metrics\"] Non-zero min_interval (10m), using cached collector."
Aug 15 19:48:15 xxx sql_exporter[21582]: ts=2023-08-15T19:48:15.685Z caller=klog.go:96 level=warn func=Warning msg="Listening on :9399"
Aug 15 19:48:15 xxx sql_exporter[21582]: ts=2023-08-15T19:48:15.692Z caller=tls_config.go:232 level=info msg="Listening on" address=[::]:9399
Aug 15 19:48:15 xxx sql_exporter[21582]: ts=2023-08-15T19:48:15.692Z caller=tls_config.go:235 level=info msg="TLS is disabled." http2=false address=[::]:9399
Aug 15 19:48:25 xxx sql_exporter[21582]: ts=2023-08-15T19:48:25.460Z caller=klog.go:84 level=debug func=Infof msg="Database handle successfully opened with 'sqlserver' driver"
Aug 15 19:48:25 xxx sql_exporter[21582]: ts=2023-08-15T19:48:25.460Z caller=klog.go:84 level=debug func=Infof msg="Error gathering metrics: [from Gatherer #1] unable to open tcp connection with host 'localhost:1433': dial tcp 127.0.0.1:1433: connect: connection refused"
Aug 15 19:49:25 xxx sql_exporter[21582]: ts=2023-08-15T19:49:25.460Z caller=klog.go:84 level=debug func=Infof msg="Error gathering metrics: [from Gatherer #1] unable to open tcp connection with host 'localhost:1433': dial tcp 127.0.0.1:1433: connect: connection refused"
sql_exporter.yml
# Global settings and defaults.
global:
# Subtracted from Prometheus' scrape_timeout to give us some headroom and prevent Prometheus from
# timing out first.
scrape_timeout_offset: 500ms
# Minimum interval between collector runs: by default (0s) collectors are executed on every scrape.
min_interval: 600s
# Maximum number of open connections to any one target. Metric queries will run concurrently on
# multiple connections.
max_connections: 3
# Maximum number of idle connections to any one target.
max_idle_connections: 3
# The target to monitor and the list of collectors to execute on it.
target:
# Data source name always has a URI schema that matches the driver name. In some cases (e.g. MySQL)
# the schema gets dropped or replaced to match the driver expected DSN format.
data_source_name: 'azuresql://XXX.database.windows.net?database=XXX&fedauth=ActiveDirectoryManagedIdentity'
# Collectors (referenced by name) to execute on the target.
collectors: [worklist_*]
collector_files:
- "*.collector.yml"
It's at least logging in because if I remove fedauth=ActiveDirectoryManagedIdentity
it errors with No metrics gathered, [from Gatherer #1] mssql: login error: Login failed for user ''.
Hmmm, seems to be a downstream issue. I see the same error with [email protected] while [email protected] works fine.
The issue was fixed in xo/dburl here.
Hey @reubano, I followed the thread, I'll release an update shortly. 👍
Still no luck :(
$ curl http://localhost:9399/metrics
No metrics gathered, [from Gatherer #1] mssql: login error: Login failed for user ''.
$ sql_exporter --version
sql_exporter, version 0.12.0 (branch: HEAD, revision: e934efa67f5702a96f6b6a1632b7e78fc678d66b)
build user: root@1e5240ff6c6d
build date: 20230817-16:40:52
go version: go1.20.7
platform: linux/amd64
tags: netgo
$ systemctl status sql-exporter
● sql-exporter.service - SQL Exporter
Loaded: loaded (/etc/systemd/system/sql-exporter.service; indirect; vendor preset: enabled)
Active: active (running) since Mon 2023-08-21 19:46:27 UTC; 25s ago
Main PID: 21235 (sql_exporter)
Tasks: 4 (limit: 4069)
CGroup: /system.slice/sql-exporter.service
└─21235 /usr/bin/sql_exporter -log.level debug -config.file sql_exporter.yml
Aug 20 19:48:15 xxx sql_exporter[21582]: ts=2023-08-20T19:48:15.680Z caller=klog.go:84 level=debug func=Infof msg="Loaded collector 'worklist_counts' from /home/azureuser/prometheus/files/worklist_counts.collector.yml"
Aug 20 19:48:15 xxx sql_exporter[21582]: ts=2023-08-20T19:48:15.682Z caller=klog.go:84 level=debug func=Infof msg="Loaded collector 'worklist_metrics' from /home/azureuser/prometheus/files/worklist_metrics.collector.yml"
Aug 20 19:48:15 xxx sql_exporter[21582]: ts=2023-08-20T19:48:15.682Z caller=klog.go:55 level=debug func=Verbose.Infof msg="[collector=\"worklist_counts\"] Non-zero min_interval (10m), using cached collector."
Aug 20 19:48:15 xxx sql_exporter[21582]: ts=2023-08-20T19:48:15.685Z caller=klog.go:55 level=debug func=Verbose.Infof msg="[collector=\"worklist_metrics\"] Non-zero min_interval (10m), using cached collector."
Aug 20 19:48:15 xxx sql_exporter[21582]: ts=2023-08-20T19:48:15.685Z caller=klog.go:96 level=warn func=Warning msg="Listening on :9399"
Aug 20 19:48:15 xxx sql_exporter[21582]: ts=2023-08-20T19:48:15.692Z caller=tls_config.go:232 level=info msg="Listening on" address=[::]:9399
Aug 20 19:48:15 xxx sql_exporter[21582]: ts=2023-08-20T19:48:15.692Z caller=tls_config.go:235 level=info msg="TLS is disabled." http2=false address=[::]:9399
Aug 20 19:48:25 xxx sql_exporter[21582]: ts=2023-08-20T19:48:25.460Z caller=klog.go:84 level=debug func=Infof msg="Database handle successfully opened with 'sqlserver' driver"
Aug 20 19:48:25 xxx sql_exporter[21582]: ts=2023-08-20T19:48:25.460Z caller=klog.go:84 level=debug func=Infof msg="Error gathering metrics: [from Gatherer #1] mssql: login error: Login failed for user ''."
Aug 20 19:49:25 xxx sql_exporter[21582]: ts=2023-08-20T19:49:25.460Z caller=klog.go:84 level=debug func=Infof msg="Error gathering metrics: [from Gatherer #1] mssql: login error: Login failed for user ''."
usql
works though (which also uses dburl v0.16.0
)
$ go install -tags sqlserver github.com/xo/[email protected]
go: downloading github.com/xo/usql v0.15.0
$ usql 'azuresql://xxx.database.windows.net?database=xxx&fedauth=ActiveDirectoryMSI'
Connected with driver sqlserver (Microsoft SQL Server x.x, RTM, SQL Azure)
Type "help" for help.
ms:xxx=>
The last sql_exporter
version I got to work was 0.9.3
.
It actually seems like 0.9.3
suffers from https://github.com/burningalchemist/sql_exporter/issues/179 since I'm seeing lots of prepare query failed: context deadline exceeded
in the logs.
Let me know if there's anything I can do to help out. I'm not a go coder, but I can try to get by.
Hey @reubano, It's a common error (context deadline exceeded
) in case when the connection can't be established.
I see the error on your end in the logs: mssql: login error: Login failed for user ''
. Might it be related to some authentication issue or you can still successfully login with the old version?
I guess I see the reason:
Database handle successfully opened with 'sqlserver' driver"
, whereas we expect azuresql
to be the right scheme to use AAD. I'll check that part and release a patch.
@reubano it's released, please have a look. I expect things to work now. :))
It's good to go now! Thanks again!!
For anyone else finding this post after searching context deadline exceeded
, that error can also mean that the query timed out. To fix it, I increased the scrape_timeout
parameter for both prometheus
and sql_exporter
so that they are longer than the expected query time.