
8.7.x fails to startup: open /usr/share/apm-server/certs/ca/ca.crt: permission denied reading

Open niemyjski opened this issue 1 year ago • 2 comments

APM Server version (apm-server version): 8.7.x

Description of the problem including expected versus actual behavior:

We have the following Docker Compose file that we use locally for APM: https://github.com/exceptionless/Exceptionless/blob/main/docker/docker-compose.apm.yml It has worked with every 8.x version up to 8.7.0. Starting with 8.7.0, APM Server fails at startup with the permission error shown in the logs below.

Steps to reproduce:


  1. Copy https://github.com/exceptionless/Exceptionless/blob/main/docker/docker-compose.apm.yml locally
  2. docker compose up

Provide logs (if relevant):

apm-1            | {"log.level":"error","@timestamp":"2023-05-18T16:19:42.327Z","log.logger":"tls","log.origin":{"file.name":"tlscommon/tls.go","file.line":161},"message":"Failed reading CA certificate: open /usr/share/apm-server/certs/ca/ca.crt: permission denied","service.name":"apm-server","ecs.version":"1.6.0"}
apm-1            | {"log.level":"info","@timestamp":"2023-05-18T16:19:42.328Z","log.logger":"beater","log.origin":{"file.name":"beater/beater.go","file.line":159},"message":"stopping apm-server... waiting maximum of 30s for queues to drain","service.name":"apm-server","ecs.version":"1.6.0"}
apm-1            | {"log.level":"error","@timestamp":"2023-05-18T16:19:42.328Z","log.logger":"tls","log.origin":{"file.name":"tlscommon/tls.go","file.line":161},"message":"Failed reading CA certificate: open /usr/share/apm-server/certs/ca/ca.crt: permission denied","service.name":"apm-server","ecs.version":"1.6.0"}
apm-1            | {"log.level":"info","@timestamp":"2023-05-18T16:19:42.328Z","log.origin":{"file.name":"beatcmd/beat.go","file.line":386},"message":"apm-server stopped.","service.name":"apm-server","ecs.version":"1.6.0"}
apm-1            | Error: 1 error: open /usr/share/apm-server/certs/ca/ca.crt: permission denied reading <nil>
apm-1            | Usage:
apm-1            |   apm-server [flags]
apm-1            |   apm-server [command]
apm-1            |

niemyjski, May 18 '23 16:05

I'm definitely not an Elastic APM Server expert, but I hit the same bug with Elastic Stack 8.9.0 in Docker while using the APM Server Docker image (see logs below).

I found a fix that works in my case, although it is probably not very clean. In the Docker Compose service named setup, I changed the following two lines:

        find . -type d -exec chmod 750 \{\} \;;
        find . -type f -exec chmod 640 \{\} \;;

... to these:

        find . -type d -exec chmod 755 \{\} \;;
        find . -type f -exec chmod 644 \{\} \;;

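I suspect that the APM Server container's user (apm-server) is neither the owner of the certificate files in the certs volume nor a member of the group that owns them. With modes 750 and 640, only the owner and the owning group can traverse the directories and read the files; 755 and 644 extend that access to every other user, which would explain why the change works.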
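
To make the difference between the two sets of modes concrete, here is a minimal sketch that tests the "other" read bit of an octal mode. The helper name other_can_read is hypothetical and only illustrates the permission arithmetic; it does not inspect the container or the certs volume.

```shell
#!/bin/sh
# other_can_read MODE: succeeds when the octal MODE grants read access to
# "other", i.e. to a user that is neither the owner nor in the owning group.
# (Hypothetical helper, for illustration only.)
other_can_read() {
  # The leading 0 makes the shell parse MODE as octal; 4 is the other-read bit.
  [ $(( 0$1 & 4 )) -ne 0 ]
}

for mode in 640 644; do
  if other_can_read "$mode"; then
    echo "$mode: readable by other users"
  else
    echo "$mode: not readable by other users"
  fi
done
```

If the apm-server user really falls into the "other" class for these files, this is the bit that decides whether ca.crt can be opened.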

APM Server Docker container logs before the fix:

2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.174Z","log.origin":{"file.name":"beatcmd/beat.go","file.line":139},"message":"Home path: [/usr/share/apm-server] Config path: [/usr/share/apm-server] Data path: [/usr/share/apm-server/data] Logs path: [/usr/share/apm-server/logs]","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.179Z","log.origin":{"file.name":"beatcmd/beat.go","file.line":146},"message":"Beat ID: 79611d02-102d-47cb-a7de-2810c1cc25d3","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.185Z","log.logger":"beat","log.origin":{"file.name":"beatcmd/beat.go","file.line":576},"message":"Beat info","service.name":"apm-server","system_info":{"beat":{"path":{"config":"/usr/share/apm-server","data":"/usr/share/apm-server/data","home":"/usr/share/apm-server","logs":"/usr/share/apm-server/logs"},"type":"apm-server","uuid":"79611d02-102d-47cb-a7de-2810c1cc25d3"},"ecs.version":"1.6.0"}}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.188Z","log.logger":"beat","log.origin":{"file.name":"beatcmd/beat.go","file.line":584},"message":"Build info","service.name":"apm-server","system_info":{"build":{"commit":"da10039b89b7e0d64520c9264d0eab3dd1793fcd","time":"2023-07-18T15:29:39.000Z","version":"8.9.0"},"ecs.version":"1.6.0"}}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.188Z","log.logger":"beat","log.origin":{"file.name":"beatcmd/beat.go","file.line":587},"message":"Go runtime info","service.name":"apm-server","system_info":{"go":{"os":"linux","arch":"amd64","max_procs":4,"version":"go1.19.10"},"ecs.version":"1.6.0"}}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.190Z","log.logger":"beat","log.origin":{"file.name":"beatcmd/beat.go","file.line":591},"message":"Host info","service.name":"apm-server","system_info":{"host":{"architecture":"x86_64","boot_time":"2023-08-02T14:28:59Z","containerized":true,"name":"608b1d6f2648","ip":["127.0.0.1/8","172.25.0.7/16"],"kernel_version":"5.10.102.1-microsoft-standard-WSL2","mac":["02:42:ac:19:00:07"],"os":{"type":"linux","family":"debian","platform":"ubuntu","name":"Ubuntu","version":"20.04.6 LTS (Focal Fossa)","major":20,"minor":4,"patch":6,"codename":"focal"},"timezone":"UTC","timezone_offset_sec":0},"ecs.version":"1.6.0"}}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.201Z","log.logger":"beat","log.origin":{"file.name":"beatcmd/beat.go","file.line":620},"message":"Process info","service.name":"apm-server","system_info":{"process":{"capabilities":{"inheritable":null,"permitted":null,"effective":null,"bounding":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"ambient":null},"cwd":"/usr/share/apm-server","exe":"/usr/share/apm-server/apm-server","name":"apm-server","pid":7,"ppid":1,"seccomp":{"mode":"filter","no_new_privs":false},"start_time":"2023-08-03T14:51:40.330Z"},"ecs.version":"1.6.0"}}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.193Z","log.origin":{"file.name":"beatcmd/maxprocs.go","file.line":68},"message":"maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.223Z","log.logger":"config","log.origin":{"file.name":"config/agentconfig.go","file.line":70},"message":"using output.elasticsearch for fetching agent config","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"warn","@timestamp":"2023-08-03T14:51:41.232Z","log.logger":"cfgwarn","log.origin":{"file.name":"tlscommon/config.go","file.line":102},"message":"DEPRECATED: Treating the CommonName field on X.509 certificates as a host name when no Subject Alternative Names are present is going to be removed. Please update your certificates if needed. Will be removed in version: 8.0.0","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.239Z","log.logger":"beater","log.origin":{"file.name":"beater/http.go","file.line":142},"message":"Listening on: [::]:8200","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.247Z","log.origin":{"file.name":"beatcmd/beat.go","file.line":394},"message":"apm-server started.","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.266Z","log.logger":"beater","log.origin":{"file.name":"beater/beater.go","file.line":201},"message":"cgroup memory limit exceed available memory, falling back to the total system memory","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.266Z","log.logger":"beater","log.origin":{"file.name":"beater/beater.go","file.line":221},"message":"MaxConcurrentDecoders set to 245 based on 80 percent of 2.4gb of memory","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.266Z","log.logger":"beater","log.origin":{"file.name":"beater/beater.go","file.line":227},"message":"Transactions.MaxTransactionGroups set to 11974 based on 2.4gb of memory","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.266Z","log.logger":"beater","log.origin":{"file.name":"beater/beater.go","file.line":233},"message":"Transactions.MaxServices set to 2394 based on 2.4gb of memory","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.266Z","log.logger":"beater","log.origin":{"file.name":"beater/beater.go","file.line":239},"message":"ServiceTransactions.MaxGroups for service aggregation set to 2394 based on 2.4gb of memory","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"error","@timestamp":"2023-08-03T14:51:41.281Z","log.logger":"tls","log.origin":{"file.name":"tlscommon/tls.go","file.line":169},"message":"Failed reading CA certificate: open certs/ca/ca.crt: permission denied","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"error","@timestamp":"2023-08-03T14:51:41.280Z","log.logger":"tls","log.origin":{"file.name":"tlscommon/tls.go","file.line":169},"message":"Failed reading CA certificate: open certs/ca/ca.crt: permission denied","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.283Z","log.origin":{"file.name":"beatcmd/beat.go","file.line":396},"message":"apm-server stopped.","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 Error: 1 error: open certs/ca/ca.crt: permission denied reading <nil>
2023-08-03 16:51:41 {"log.level":"info","@timestamp":"2023-08-03T14:51:41.282Z","log.logger":"beater","log.origin":{"file.name":"beater/beater.go","file.line":166},"message":"stopping apm-server... waiting maximum of 30s for queues to drain","service.name":"apm-server","ecs.version":"1.6.0"}
2023-08-03 16:51:41 Usage:
2023-08-03 16:51:41   apm-server [flags]
2023-08-03 16:51:41   apm-server [command]
2023-08-03 16:51:41 
2023-08-03 16:51:41 Available Commands:
2023-08-03 16:51:41   apikey      Manage API Keys for communication between APM agents and server (deprecated)
2023-08-03 16:51:41   export      Export current config
2023-08-03 16:51:41   help        Help about any command
2023-08-03 16:51:41   keystore    Manage secrets keystore
2023-08-03 16:51:41   run         Run APM Server
2023-08-03 16:51:41   test        Test config
2023-08-03 16:51:41   version     Show current version info
2023-08-03 16:51:41 
2023-08-03 16:51:41 Flags:
2023-08-03 16:51:41   -E, --E setting=value      Configuration overwrite
2023-08-03 16:51:41   -N, --N                    Disable actual publishing for testing
2023-08-03 16:51:41   -c, --c string             Configuration file, relative to path.config (default "apm-server.yml")
2023-08-03 16:51:41       --cpuprofile string    Write cpu profile to file
2023-08-03 16:51:41   -d, --d stringArray        Enable certain debug selectors
2023-08-03 16:51:41   -e, --e                    Log to stderr and disable syslog/file output
2023-08-03 16:51:41       --environment string   Set the environment in which the process is running (default "default")
2023-08-03 16:51:41   -h, --help                 help for apm-server
2023-08-03 16:51:41       --httpprof string      Start pprof http server
2023-08-03 16:51:41       --memprofile string    Write memory profile to this file
2023-08-03 16:51:41       --path.config string   Configuration path
2023-08-03 16:51:41       --path.data string     Data path
2023-08-03 16:51:41       --path.home string     Home path
2023-08-03 16:51:41       --path.logs string     Logs path
2023-08-03 16:51:41       --strict.perms         Strict permission checking on config files (default true)
2023-08-03 16:51:41   -v, --v                    Log at INFO level
2023-08-03 16:51:41 
2023-08-03 16:51:41 Use "apm-server [command] --help" for more information about a command.
2023-08-03 16:51:41

fterrani, Aug 03 '23 15:08

For completeness, here is the Compose file I used, in case it matters to someone. It is based on the one featured in the following article: https://www.elastic.co/fr/blog/getting-started-with-the-elastic-stack-and-docker-compose

The original Docker Compose file is here: https://github.com/elkninja/elastic-stack-docker-part-one/blob/main/docker-compose.yml

My modified Docker Compose file, which:

  • includes an APM server Docker container
  • adds read permission for all users on the certs volume in the setup service:

version: "3.8"

volumes:
  certs:
    driver: local
  esdata01:
    driver: local
  kibanadata:
    driver: local
  metricbeatdata01:
    driver: local
  filebeatdata01:
    driver: local
  logstashdata01:
    driver: local

networks:
  default:
    name: elastic
    external: false

services:
  setup:
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    volumes:
      - certs:/usr/share/elasticsearch/config/certs
    user: "0"
    command: >
      bash -c '
        if [ x${ELASTIC_PASSWORD} == x ]; then
          echo "Set the ELASTIC_PASSWORD environment variable in the .env file";
          exit 1;
        elif [ x${KIBANA_PASSWORD} == x ]; then
          echo "Set the KIBANA_PASSWORD environment variable in the .env file";
          exit 1;
        fi;
        if [ ! -f config/certs/ca.zip ]; then
          echo "Creating CA";
          bin/elasticsearch-certutil ca --silent --pem -out config/certs/ca.zip;
          unzip config/certs/ca.zip -d config/certs;
        fi;
        if [ ! -f config/certs/certs.zip ]; then
          echo "Creating certs";
          echo -ne \
          "instances:\n"\
          "  - name: es01\n"\
          "    dns:\n"\
          "      - es01\n"\
          "      - localhost\n"\
          "    ip:\n"\
          "      - 127.0.0.1\n"\
          "  - name: kibana\n"\
          "    dns:\n"\
          "      - kibana\n"\
          "      - localhost\n"\
          "    ip:\n"\
          "      - 127.0.0.1\n"\
          > config/certs/instances.yml;
          bin/elasticsearch-certutil cert --silent --pem -out config/certs/certs.zip --in config/certs/instances.yml --ca-cert config/certs/ca/ca.crt --ca-key config/certs/ca/ca.key;
          unzip config/certs/certs.zip -d config/certs;
        fi;
        echo "Setting file permissions"
        chown -R root:root config/certs;
        find . -type d -exec chmod 755 \{\} \;;
        find . -type f -exec chmod 644 \{\} \;;
        echo "Waiting for Elasticsearch availability";
        until curl -s --cacert config/certs/ca/ca.crt https://es01:9200 | grep -q "missing authentication credentials"; do sleep 30; done;
        echo "Setting kibana_system password";
        until curl -s -X POST --cacert config/certs/ca/ca.crt -u "elastic:${ELASTIC_PASSWORD}" -H "Content-Type: application/json" https://es01:9200/_security/user/kibana_system/_password -d "{\"password\":\"${KIBANA_PASSWORD}\"}" | grep -q "^{}"; do sleep 10; done;
        echo "All done!";
      '
    healthcheck:
      test: ["CMD-SHELL", "[ -f config/certs/es01/es01.crt ]"]
      interval: 1s
      timeout: 5s
      retries: 120

  es01:
    depends_on:
      setup:
        condition: service_healthy
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    labels:
      co.elastic.logs/module: elasticsearch
    volumes:
      - certs:/usr/share/elasticsearch/config/certs
      - esdata01:/usr/share/elasticsearch/data
    ports:
      - ${ES_PORT}:9200
    environment:
      - node.name=es01
      - cluster.name=${CLUSTER_NAME}
      - discovery.type=single-node
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - bootstrap.memory_lock=true
      - xpack.security.enabled=true
      - xpack.security.http.ssl.enabled=true
      - xpack.security.http.ssl.key=certs/es01/es01.key
      - xpack.security.http.ssl.certificate=certs/es01/es01.crt
      - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.transport.ssl.enabled=true
      - xpack.security.transport.ssl.key=certs/es01/es01.key
      - xpack.security.transport.ssl.certificate=certs/es01/es01.crt
      - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.transport.ssl.verification_mode=certificate
      - xpack.license.self_generated.type=${LICENSE}
    mem_limit: ${ES_MEM_LIMIT}
    ulimits:
      memlock:
        soft: -1
        hard: -1
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'",
        ]
      interval: 10s
      timeout: 10s
      retries: 120

  kibana:
    depends_on:
      es01:
        condition: service_healthy
    image: docker.elastic.co/kibana/kibana:${STACK_VERSION}
    labels:
      co.elastic.logs/module: kibana
    volumes:
      - certs:/usr/share/kibana/config/certs
      - kibanadata:/usr/share/kibana/data
    ports:
      - ${KIBANA_PORT}:5601
    environment:
      - SERVERNAME=kibana
      - ELASTICSEARCH_HOSTS=https://es01:9200
      - ELASTICSEARCH_USERNAME=kibana_system
      - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}
      - ELASTICSEARCH_SSL_CERTIFICATEAUTHORITIES=config/certs/ca/ca.crt
      - XPACK_SECURITY_ENCRYPTIONKEY=${ENCRYPTION_KEY}
      - XPACK_ENCRYPTEDSAVEDOBJECTS_ENCRYPTIONKEY=${ENCRYPTION_KEY}
      - XPACK_REPORTING_ENCRYPTIONKEY=${ENCRYPTION_KEY}
    mem_limit: ${KB_MEM_LIMIT}
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -s -I http://localhost:5601 | grep -q 'HTTP/1.1 302 Found'",
        ]
      interval: 10s
      timeout: 10s
      retries: 120

  metricbeat01:
    depends_on:
      es01:
        condition: service_healthy
      kibana:
        condition: service_healthy
    image: docker.elastic.co/beats/metricbeat:${STACK_VERSION}
    user: root
    volumes:
      - certs:/usr/share/metricbeat/certs
      - metricbeatdata01:/usr/share/metricbeat/data
      - "./metricbeat.yml:/usr/share/metricbeat/metricbeat.yml:ro"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "/sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro"
      - "/proc:/hostfs/proc:ro"
      - "/:/hostfs:ro"
    environment:
      - ELASTIC_USER=elastic
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - ELASTIC_HOSTS=https://es01:9200
      - KIBANA_HOSTS=http://kibana:5601
      - LOGSTASH_HOSTS=http://logstash01:9600

  filebeat01:
    depends_on:
      es01:
        condition: service_healthy
    image: docker.elastic.co/beats/filebeat:${STACK_VERSION}
    user: root
    volumes:
      - certs:/usr/share/filebeat/certs
      - filebeatdata01:/usr/share/filebeat/data
      - "./filebeat_ingest_data/:/usr/share/filebeat/ingest_data/"
      - "./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro"
      - "/var/lib/docker/containers:/var/lib/docker/containers:ro"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
    environment:
      - ELASTIC_USER=elastic
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - ELASTIC_HOSTS=https://es01:9200
      - KIBANA_HOSTS=http://kibana:5601
      - LOGSTASH_HOSTS=http://logstash01:9600

  logstash01:
    depends_on:
      es01:
        condition: service_healthy
      kibana:
        condition: service_healthy
    image: docker.elastic.co/logstash/logstash:${STACK_VERSION}
    labels:
      co.elastic.logs/module: logstash
    user: root
    volumes:
      - certs:/usr/share/logstash/certs
      - logstashdata01:/usr/share/logstash/data
      - "./logstash_ingest_data/:/usr/share/logstash/ingest_data/"
      - "./logstash.conf:/usr/share/logstash/pipeline/logstash.conf:ro"
    environment:
      - xpack.monitoring.enabled=false
      - ELASTIC_USER=elastic
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - ELASTIC_HOSTS=https://es01:9200


  apm-server:
    depends_on:
      es01:
        condition: service_healthy
    image: docker.elastic.co/apm/apm-server:${STACK_VERSION}
    user: apm-server
    volumes:
      - certs:/usr/share/apm-server/certs
      - "./apm-server.docker.yml:/usr/share/apm-server/apm-server.yml:ro"
    ports:
      - 8200:8200
    environment:
      - ELASTIC_HOSTS=https://es01:9200
      - ELASTIC_USER=elastic
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
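
One caveat with the 644 change is that it also makes the private keys in the certs volume (for example ca/ca.key) readable by every user. A narrower variant might open up only the certificates and leave everything else at the original 640. The function below is an untested sketch of that idea, not something verified against the APM Server image; the name set_cert_permissions and its directory argument are assumptions.

```shell
#!/bin/sh
# set_cert_permissions DIR: hypothetical helper (not from the original Compose
# file) that makes certificates world-readable while keeping private keys and
# all other files closed to "other" users.
set_cert_permissions() {
  dir=$1
  # Directories must be traversable by everyone so the .crt files can be reached.
  find "$dir" -type d -exec chmod 755 {} \;
  # Everything that is not a certificate keeps the original, stricter mode.
  find "$dir" -type f ! -name '*.crt' -exec chmod 640 {} \;
  # Certificates are public material and can be read by any user.
  find "$dir" -type f -name '*.crt' -exec chmod 644 {} \;
}
```

In the setup service this would be called as set_cert_permissions config/certs in place of the two find lines. Whether APM Server needs anything beyond ca.crt from that volume depends on its configuration, so treat this as a starting point.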

fterrani, Aug 03 '23 15:08