Add vault.agent.authenticated metric

Open markafarrell opened this issue 1 year ago • 4 comments

This adds a gauge metric to Vault Agent telemetry that shows whether the agent is currently authenticated and has a valid token.

A value of 1 means the agent has successfully authenticated with the Vault server and has a valid token.

A value of 0 means the agent does not have a valid token.

fixes #26569

markafarrell avatar Apr 22 '24 03:04 markafarrell

The steps below demonstrate the new metric.

Generate TLS certificates

mkdir -p container-data/vault/tls
openssl req -x509 -nodes -days 9999 -newkey rsa:2048 \
-keyout  container-data/vault/tls/vault_server.key -out  container-data/vault/tls/vault_server.crt \
-subj "/C=AU/ST=Some-State/L=Some-City/O=Internet Widgits Pty Ltd/OU=Something/\
CN=vault-server" \
-addext "subjectAltName = DNS:vault-server"
chmod g+r container-data/vault/tls/*

Generate configuration

mkdir -p container-data/vault/config

cat <<EOF > container-data/vault/config/vault_main.hcl
ui = true

listener "tcp" {
  address = "[::]:8200"
  cluster_address = "[::]:8201"
  tls_cert_file = "/vault/tls/vault_server.crt"
  tls_key_file  = "/vault/tls/vault_server.key"
}

storage "file" {
  path = "/vault/data"
}
EOF

Start vault

mkdir -p logs

mkdir -p logs/vault

chmod g+w logs/vault

docker network create vault-agent-test

docker run --rm -d -p 8200:8200 -e VAULT_LOG_LEVEL=debug -e NO_PROXY="vault-server" -v $PWD/container-data/vault/tls/:/vault/tls/ -v $PWD/container-data/vault/config/vault_main.hcl:/vault/config/vault_main.hcl -v $PWD/logs/vault:/var/log/vault --cap-add IPC_LOCK --network=vault-agent-test --name=vault-server hashicorp/vault:1.16.1 server

docker run --rm -it --cap-add IPC_LOCK -e VAULT_CLI_NO_COLOR=1 -e VAULT_ADDR=https://vault-server:8200 -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 operator init | tee logs/init.log

Extract unseal keys and root token

mkdir -p secrets
grep "Unseal Key" logs/init.log | awk -F':' '{ print $2}' | tr -d ' ' | tr -d '\r' | tee secrets/unseal_keys
grep "Initial Root Token" logs/init.log | awk -F':' '{ print $2}' | tr -d ' ' | tr -d '\n' | tr -d '\r' | tee secrets/root_token
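
The containers run with -it, so captured output can carry carriage returns, which is why the root token pipeline strips '\r'. Below is a minimal sketch of the same extraction as a reusable function, exercised against placeholder input. The name extract_init_field and the sample values are assumptions for illustration and are not part of the Vault CLI.

```shell
# Sketch: pull one field out of `vault operator init` output read from stdin.
# extract_init_field is a hypothetical helper name, not a Vault command.
extract_init_field() {
    # $1: label to match, e.g. "Unseal Key" or "Initial Root Token".
    # Split on ':' and drop spaces and carriage returns from the value.
    grep "$1" | awk -F':' '{ print $2 }' | tr -d ' \r'
}

# Placeholder input in the "label: value" shape the commands above rely on.
# The values are made up; CRLF line endings mimic TTY output.
sample_init_log() {
    printf 'Unseal Key 1: placeholder-key-1\r\n'
    printf 'Unseal Key 2: placeholder-key-2\r\n'
    printf 'Initial Root Token: placeholder-root-token\r\n'
}

sample_init_log | extract_init_field "Unseal Key"
sample_init_log | extract_init_field "Initial Root Token"
```

Against the real file, the same function would be fed with `extract_init_field "Unseal Key" < logs/init.log`.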

Unseal Vault

for k in $(cat secrets/unseal_keys)
do
    docker run --rm -it --cap-add IPC_LOCK -e VAULT_ADDR=https://vault-server:8200 -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 operator unseal $k
done

Enable approle auth

docker run --rm -it --cap-add IPC_LOCK -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 auth enable approle

Create approle

docker run --rm -it --cap-add IPC_LOCK -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 \
vault write auth/approle/role/my-role \
    secret_id_ttl=10m \
    token_num_uses=10 \
    token_ttl=20m \
    token_max_ttl=30m \
    secret_id_num_uses=40

mkdir -p secrets/approle

docker run --rm -it --cap-add IPC_LOCK -e VAULT_CLI_NO_COLOR=1 -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 vault read -field=role_id auth/approle/role/my-role/role-id | tr -d '\n' | tr -d '\r' | tee secrets/approle/role-id; echo

docker run --rm -it --cap-add IPC_LOCK -e VAULT_CLI_NO_COLOR=1 -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 vault write -field secret_id -f auth/approle/role/my-role/secret-id | tr -d '\n' | tr -d '\r' | tee secrets/approle/secret-id; echo

Generate Vault Agent configuration

mkdir -p container-data/vault-agent/config

cat <<EOF > container-data/vault-agent/config/vault-agent-conf.hcl
auto_auth {
  method {
    type = "approle"

    config = {
      role_id_file_path = "/etc/vault/approle/role-id"
      secret_id_file_path = "/etc/vault/approle/secret-id"
    }
  }

  sinks {
    sink {
      type = "file"

      config = {
        path = "/tmp/file-foo"
      }
    }
  }
}
listener "tcp" {
  address = "0.0.0.0:8100"
  tls_disable = true
  unauthenticated_metrics_access = true
}

telemetry {
  disable_hostname = true
}

cache {}
EOF

Start Vault Agent

docker run --rm -d -p 8100:8100 -e VAULT_LOG_LEVEL=debug -e VAULT_ADDR=https://vault-server:8200 -e VAULT_SKIP_VERIFY=TRUE -e NO_PROXY="vault-server" -v $PWD/container-data/vault/tls/:/vault/tls/ -v $PWD/container-data/vault-agent/config/vault-agent-conf.hcl:/vault/config/vault-agent-conf.hcl -v $PWD/secrets/approle:/etc/vault/approle:rw --cap-add IPC_LOCK --network=vault-agent-test --name=vault-agent hashicorp/vault:1.16.1 agent -config /vault/config/vault-agent-conf.hcl
docker logs vault-agent

Get Vault Agent Metrics

curl -s -x '' 'http://127.0.0.1:8100/agent/v1/metrics?format=prometheus' | grep vault_agent_auth
# HELP vault_agent_auth_success vault_agent_auth_success
# TYPE vault_agent_auth_success counter
vault_agent_auth_success 2

Start Modified Vault Agent

# Stop the stock agent first; it uses the same container name and host port
docker stop vault-agent
docker run --rm -d -p 8100:8100 -e VAULT_LOG_LEVEL=debug -e VAULT_ADDR=https://vault-server:8200 -e VAULT_SKIP_VERIFY=TRUE -e NO_PROXY="vault-server" -v $PWD/container-data/vault/tls/:/vault/tls/ -v $PWD/container-data/vault-agent/config/vault-agent-conf.hcl:/vault/config/vault-agent-conf.hcl -v $PWD/secrets/approle:/etc/vault/approle:rw --cap-add IPC_LOCK --network=vault-agent-test --name=vault-agent vault:dev agent -config /vault/config/vault-agent-conf.hcl
docker logs vault-agent

Get Vault Agent Metrics

curl -s -x '' 'http://127.0.0.1:8100/agent/v1/metrics?format=prometheus' | grep vault_agent_auth
# HELP vault_agent_auth_authenticated vault_agent_auth_authenticated
# TYPE vault_agent_auth_authenticated gauge
vault_agent_auth_authenticated 1
# HELP vault_agent_auth_failure vault_agent_auth_failure
# TYPE vault_agent_auth_failure counter
vault_agent_auth_failure 4
# HELP vault_agent_auth_success vault_agent_auth_success
# TYPE vault_agent_auth_success counter
vault_agent_auth_success 2
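
To read a single value out of that output rather than grepping for a prefix, something like the following could be used. It is only a sketch: metric_value is a hypothetical helper, it assumes metrics without labels (as in the output above), and the sample input is copied from that output.

```shell
# metric_value is a hypothetical helper: print the value of one un-labelled
# metric from Prometheus text-format input on stdin.
# Exits non-zero when the metric is not present.
metric_value() {
    # "# HELP" / "# TYPE" lines have "#" as their first field, so an exact
    # match on the first field skips them.
    awk -v name="$1" '$1 == name { print $2; found = 1 } END { exit !found }'
}

# Sample input copied from the modified agent's output above.
sample_metrics() {
    cat <<'EOF'
# HELP vault_agent_auth_authenticated vault_agent_auth_authenticated
# TYPE vault_agent_auth_authenticated gauge
vault_agent_auth_authenticated 1
# HELP vault_agent_auth_failure vault_agent_auth_failure
# TYPE vault_agent_auth_failure counter
vault_agent_auth_failure 4
EOF
}

sample_metrics | metric_value vault_agent_auth_authenticated
```

Against a running agent, the curl command above would be piped into `metric_value vault_agent_auth_authenticated` in place of sample_metrics.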

markafarrell avatar Apr 22 '24 03:04 markafarrell

Hi @markafarrell, thank you so much for your PR! Before we proceed, I'd love to understand your use case a little better. Currently, we can see when authentication has succeeded and the agent has a valid token in the server logs (i.e. https://github.com/hashicorp/vault/blob/main/command/agentproxyshared/auth/auth.go#L480 ). Is there a reason that telemetry might serve your needs better than the server logs?

divyaac avatar Apr 22 '24 20:04 divyaac

@divyaac Having a metric makes it much easier to integrate with alerting tools such as Prometheus Alertmanager.

You can then get an alert when the metric goes to zero and act promptly, instead of having to look through logs to see that the agent is not authenticated.
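
As a rough illustration of that kind of check outside Prometheus, here is a sketch of a probe that maps the gauge to an exit code. check_agent_auth is a hypothetical name, and the 0/2/3 exit codes follow a common monitoring-plugin convention rather than anything defined by Vault.

```shell
# check_agent_auth is a hypothetical probe: it reads Prometheus text-format
# metrics on stdin and maps the gauge to a monitoring-style exit code.
check_agent_auth() {
    value=$(awk '$1 == "vault_agent_auth_authenticated" { print $2 }')
    case "$value" in
        1) echo "OK: agent holds a valid token"; return 0 ;;
        0) echo "CRITICAL: agent is not authenticated"; return 2 ;;
        *) echo "UNKNOWN: gauge missing or unexpected value"; return 3 ;;
    esac
}

# Demonstration with a hand-written sample line instead of a live agent.
printf 'vault_agent_auth_authenticated 0\n' | check_agent_auth || echo "exit code: $?"
```

With a live agent, the metrics endpoint output would be piped in, for example `curl -s 'http://127.0.0.1:8100/agent/v1/metrics?format=prometheus' | check_agent_auth`.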

markafarrell avatar Apr 22 '24 21:04 markafarrell

Thanks for your response, @markafarrell. I think adding this metric would make sense. After the comments are addressed, we should be able to move this PR along!

divyaac avatar Apr 23 '24 18:04 divyaac

Thanks for this! I chatted with @divyaac and she's approved it. I resolved the merge conflicts (I think they were mostly my fault!) and I'll try to get this merged if everything passes. Great work :D

VioletHynes avatar May 28 '24 15:05 VioletHynes