Add vault.agent.authenticated metric
This adds a metric to Vault Agent telemetry that shows whether the agent is currently authenticated and holds a valid token.
When the metric is 1, the agent has successfully authenticated with the Vault server and has a valid token.
When it is 0, the agent does not have a valid token.
fixes #26569
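As a rough illustration of how the 1/0 semantics could be consumed, here is a minimal sketch of a shell check that reads Prometheus text-format metrics on stdin. The function name `agent_authenticated` is hypothetical, and the metric name is assumed to be the one shown in the sample output further down; a missing metric is treated the same as 0.

```shell
# Metric name as it appears in the sample Prometheus output of this PR
# (an assumption; adjust it if your agent exports a different name).
METRIC="vault_agent_auth_authenticated"

# Read Prometheus text-format metrics on stdin.
# Succeeds only when the gauge is present and equal to 1.
agent_authenticated() {
  awk -v name="$METRIC" '
    $1 == name { value = $2; found = 1 }
    END { if (found && value == 1) exit 0; exit 1 }
  '
}

# Example with canned input; in practice the curl output would be piped in.
if printf '%s\n' "# TYPE $METRIC gauge" "$METRIC 1" | agent_authenticated; then
  echo "agent is authenticated"
else
  echo "agent is NOT authenticated"
fi
```

The exit status makes the check usable from a health-check script or a cron job without parsing any output.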
The steps below can be used to demonstrate the new metric:
Generate TLS certificates
mkdir -p container-data/vault/tls
openssl req -x509 -nodes -days 9999 -newkey rsa:2048 \
-keyout container-data/vault/tls/vault_server.key -out container-data/vault/tls/vault_server.crt \
-subj "/C=AU/ST=Some-State/L=Some-City/O=Internet Widgits Pty Ltd/OU=Something/\
CN=vault-server" \
-addext "subjectAltName = DNS:vault-server"
chmod g+r container-data/vault/tls/*
Generate configuration
mkdir -p container-data/vault/config
cat <<EOF > container-data/vault/config/vault_main.hcl
ui = true
listener "tcp" {
address = "[::]:8200"
cluster_address = "[::]:8201"
tls_cert_file = "/vault/tls/vault_server.crt"
tls_key_file = "/vault/tls/vault_server.key"
}
storage "file" {
path = "/vault/data"
}
EOF
Start vault
mkdir -p logs/vault
chmod g+w logs/vault
docker network create vault-agent-test
docker run --rm -d -p 8200:8200 -e VAULT_LOG_LEVEL=debug -e NO_PROXY="vault-server" -v $PWD/container-data/vault/tls/:/vault/tls/ -v $PWD/container-data/vault/config/vault_main.hcl:/vault/config/vault_main.hcl -v $PWD/logs/vault:/var/log/vault --cap-add IPC_LOCK --network=vault-agent-test --name=vault-server hashicorp/vault:1.16.1 server
docker run --rm -it --cap-add IPC_LOCK -e VAULT_CLI_NO_COLOR=1 -e VAULT_ADDR=https://vault-server:8200 -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 operator init | tee logs/init.log
Extract unseal keys and root token
mkdir -p secrets
grep "Unseal Key" logs/init.log | awk -F':' '{ print $2}' | tr -d ' ' | tr -d '\r' | tee secrets/unseal_keys
grep "Initial Root Token" logs/init.log | awk -F':' '{ print $2}' | tr -d ' ' | tr -d '\n' | tr -d '\r' | tee secrets/root_token
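Because the init command runs with `-it`, the captured log can contain carriage returns. The following is a minimal sketch of a single helper that strips them and extracts either kind of value; `extract_init_field` is a hypothetical name, and the sample keys and token are fake placeholders in the shape `vault operator init` prints.

```shell
# Print the value of every "<label>: <value>" line whose label starts with
# the given prefix, with carriage returns removed.
extract_init_field() {
  tr -d '\r' | awk -F': ' -v prefix="$1" 'index($0, prefix) == 1 { print $2 }'
}

# Canned sample input; the values are fake.
sample=$(printf 'Unseal Key 1: abc123\r\nUnseal Key 2: def456\r\n\r\nInitial Root Token: hvs.example\r\n')

printf '%s\n' "$sample" | extract_init_field "Unseal Key"
printf '%s\n' "$sample" | extract_init_field "Initial Root Token"
```

Against the real log it would be used as `extract_init_field "Unseal Key" < logs/init.log`.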
Unseal Vault
for k in $(cat secrets/unseal_keys)
do
docker run --rm -it --cap-add IPC_LOCK -e VAULT_ADDR=https://vault-server:8200 -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 operator unseal $k
done
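The loop above submits every key in the file. With the default `vault operator init` settings (5 key shares, threshold of 3) only three are needed, so a variant could stop at the threshold. This is a sketch under that assumption; `threshold_keys` is a hypothetical helper, and the example is a dry run that only prints the commands.

```shell
# Default for `vault operator init` unless -key-threshold was passed.
THRESHOLD=3

# Print the first N non-empty lines of a key file, without carriage returns.
# Usage: threshold_keys FILE N
threshold_keys() {
  tr -d '\r' < "$1" | awk 'NF' | head -n "$2"
}

# Dry run with placeholder keys: print the commands instead of running them.
keyfile=$(mktemp)
printf 'k1\nk2\nk3\nk4\nk5\n' > "$keyfile"
for k in $(threshold_keys "$keyfile" "$THRESHOLD"); do
  echo "vault operator unseal $k"
done
rm -f "$keyfile"
```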
Enable approle auth
docker run --rm -it --cap-add IPC_LOCK -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 auth enable approle
Create approle
docker run --rm -it --cap-add IPC_LOCK -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 \
vault write auth/approle/role/my-role \
secret_id_ttl=10m \
token_num_uses=10 \
token_ttl=20m \
token_max_ttl=30m \
secret_id_num_uses=40
mkdir -p secrets/approle
docker run --rm -it --cap-add IPC_LOCK -e VAULT_CLI_NO_COLOR=1 -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 vault read -field=role_id auth/approle/role/my-role/role-id | tr -d '\n' | tr -d '\r' | tee secrets/approle/role-id; echo
docker run --rm -it --cap-add IPC_LOCK -e VAULT_CLI_NO_COLOR=1 -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 vault write -field secret_id -f auth/approle/role/my-role/secret-id | tr -d '\n' | tr -d '\r' | tee secrets/approle/secret-id; echo
Generate Vault Agent configuration
mkdir -p container-data/vault-agent/config
cat <<EOF > container-data/vault-agent/config/vault-agent-conf.hcl
auto_auth {
method {
type = "approle"
config = {
role_id_file_path = "/etc/vault/approle/role-id"
secret_id_file_path = "/etc/vault/approle/secret-id"
}
}
sinks {
sink {
type = "file"
config = {
path = "/tmp/file-foo"
}
}
}
}
listener "tcp" {
address = "0.0.0.0:8100"
tls_disable = true
unauthenticated_metrics_access = true
}
telemetry {
disable_hostname = true
}
cache {}
EOF
Start Vault Agent
docker run --rm -d -p 8100:8100 -e VAULT_LOG_LEVEL=debug -e VAULT_ADDR=https://vault-server:8200 -e VAULT_SKIP_VERIFY=TRUE -e NO_PROXY="vault-server" -v $PWD/container-data/vault/tls/:/vault/tls/ -v $PWD/container-data/vault-agent/config/vault-agent-conf.hcl:/vault/config/vault-agent-conf.hcl -v $PWD/secrets/approle:/etc/vault/approle:rw --cap-add IPC_LOCK --network=vault-agent-test --name=vault-agent hashicorp/vault:1.16.1 agent -config /vault/config/vault-agent-conf.hcl
docker logs vault-agent
Get Vault Agent Metrics
curl -s -x '' 'http://127.0.0.1:8100/agent/v1/metrics?format=prometheus' | grep vault_agent_auth
# HELP vault_agent_auth_success vault_agent_auth_success
# TYPE vault_agent_auth_success counter
vault_agent_auth_success 2
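Start Modified Vault Agent (after stopping the stock one, which uses the same container name and port)
docker stop vault-agent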
docker run --rm -d -p 8100:8100 -e VAULT_LOG_LEVEL=debug -e VAULT_ADDR=https://vault-server:8200 -e VAULT_SKIP_VERIFY=TRUE -e NO_PROXY="vault-server" -v $PWD/container-data/vault/tls/:/vault/tls/ -v $PWD/container-data/vault-agent/config/vault-agent-conf.hcl:/vault/config/vault-agent-conf.hcl -v $PWD/secrets/approle:/etc/vault/approle:rw --cap-add IPC_LOCK --network=vault-agent-test --name=vault-agent vault:dev agent -config /vault/config/vault-agent-conf.hcl
docker logs vault-agent
Get Vault Agent Metrics
curl -s -x '' 'http://127.0.0.1:8100/agent/v1/metrics?format=prometheus' | grep vault_agent_auth
# HELP vault_agent_auth_authenticated vault_agent_auth_authenticated
# TYPE vault_agent_auth_authenticated gauge
vault_agent_auth_authenticated 1
# HELP vault_agent_auth_failure vault_agent_auth_failure
# TYPE vault_agent_auth_failure counter
vault_agent_auth_failure 4
# HELP vault_agent_auth_success vault_agent_auth_success
# TYPE vault_agent_auth_success counter
vault_agent_auth_success 2
Hi @markafarrell, thank you so much for your PR! Before we proceed, I'd love to understand your use case a little better. Currently, we can see when authentication has succeeded and the agent has a valid token in the server logs (i.e. https://github.com/hashicorp/vault/blob/main/command/agentproxyshared/auth/auth.go#L480). Is there a reason that telemetry might serve your needs better than the server logs?
@divyaac Having a metric makes it much easier to integrate with alerting tools like Prometheus Alertmanager.
Then you can get an alert when that metric goes to zero and act promptly, instead of having to look at the logs to see that the agent is not authenticated.
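To make the alerting use case concrete, here is a minimal sketch that writes a Prometheus alerting rule in the same heredoc style as the setup steps. It assumes the agent endpoint is scraped by Prometheus and that the gauge is exported as `vault_agent_auth_authenticated`, as in the sample output; the file name, group name, alert name, `for` duration and severity label are illustrative choices, not part of this PR.

```shell
# Write an alerting rule that fires when the agent reports it has no valid
# token. All names and the 5m duration below are illustrative assumptions.
cat <<'EOF' > vault-agent-alerts.yml
groups:
  - name: vault-agent
    rules:
      - alert: VaultAgentNotAuthenticated
        expr: vault_agent_auth_authenticated == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Vault Agent has no valid token"
EOF
```

If `promtool` is available, `promtool check rules vault-agent-alerts.yml` can validate the file before it is loaded.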
Thanks for your response, @markafarrell. I think adding this metric would make sense. After addressing the comments, we should be able to move this PR along!
Thanks for this! I chatted with @divyaac and she's approved it, I resolved the merge conflicts (I think they were mostly my fault!) and I'll try and get this merged if everything passes. Great work :D