[Observability Onboarding] Catch 22 when onboarding to APM (?)

Open dkarlovi opened this issue 1 year ago • 4 comments

Kibana version:

8.13.0

Elasticsearch version:

8.13.0

Server OS version:

Docker

Browser version:

Google Chrome 124.0.6367.118 (Official Build) (64-bit)

Browser OS version:

Fedora Linux 40

Original install method (e.g. download page, yum, from source, etc.):

Docker image

Describe the bug:

When onboarding to APM, there seems to be a catch 22 of sorts:

  1. Kibana needs the APM server to be available and running
  2. APM server needs the index templates, which Kibana installs

Steps to reproduce:

# docker-compose.yaml
services:
    search:
        image: elasticsearch:8.13.0
        environment:
            - cluster.name=es1
            - discovery.type=single-node
            - xpack.security.enabled=false
            - ES_JAVA_OPTS=-Xms1g -Xmx1g
    kibana:
        image: kibana:8.13.0
        environment:
            - ELASTICSEARCH_HOSTS=http://search:9200
            - ELASTIC_APM_SERVER_URL=http://telemetry:8200
    telemetry:
        image: elastic/apm-server:8.13.0
        command: apm-server -e -E output.elasticsearch.hosts=http://search:9200

When I use "Check APM Server status" on /kibana/app/home#/tutorial/apm

I get:

No APM Server detected. Please make sure it is running and you have updated to 7.0 or higher.

but, from the APM server logs, I keep getting

{"log.level":"error","@timestamp":"2024-05-03T15:04:15.720Z","log.logger":"beater","log.origin":{"function":"github.com/elastic/apm-server/internal/beater.waitReady","file.name":"beater/waitready.go","file.line":62},"message":"precondition 'apm integration installed' failed: error querying Elasticsearch for integration index templates: unexpected HTTP status: 404 Not Found ({\"error\":{\"root_cause\":[{\"type\":\"resource_not_found_exception\",\"reason\":\"index template matching [metrics-apm.service_summary.60m] not found\"}],\"type\":\"resource_not_found_exception\",\"reason\":\"index template matching [metrics-apm.service_summary.60m] not found\"},\"status\":404}): to remediate, please install the apm integration: https://ela.st/apm-integration-quickstart","service.name":"apm-server","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2024-05-03T15:04:20.720Z","log.logger":"beater","log.origin":{"function":"github.com/elastic/apm-server/internal/beater.waitReady","file.name":"beater/waitready.go","file.line":62},"message":"precondition 'apm integration installed' failed: error querying Elasticsearch for integration index templates: unexpected HTTP status: 404 Not Found ({\"error\":{\"root_cause\":[{\"type\":\"resource_not_found_exception\",\"reason\":\"index template matching [metrics-apm.service_summary.60m] not found\"}],\"type\":\"resource_not_found_exception\",\"reason\":\"index template matching [metrics-apm.service_summary.60m] not found\"},\"status\":404}): to remediate, please install the apm integration: https://ela.st/apm-integration-quickstart","service.name":"apm-server","ecs.version":"1.6.0"}

From my understanding after reading a bunch of ES docs, forum posts and error messages, Kibana is supposed to install these templates (as the errors suggest), but Kibana doesn't seem to think the APM server is running, even though it is running and is accessible from Kibana's container:

kibana@e5b74be5283d:~$ env | grep SERVER_URL
ELASTIC_APM_SERVER_URL=http://telemetry:8200
kibana@e5b74be5283d:~$ curl http://telemetry:8200
{
  "build_date": "2024-03-21T15:24:03Z",
  "build_sha": "77c62777a39b536ff3a2a663456a2e0086787552",
  "publish_ready": false,
  "version": "8.13.0"
}
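
Note that the response reports `"publish_ready": false`, so the server answers HTTP requests but does not yet consider itself ready to publish events, which appears consistent with the failing precondition in the logs. A minimal sketch for summarizing that response, assuming the body has already been fetched (for example with the `curl` call above); the function name `summarize_apm_info` is made up for illustration:

```python
import json


def summarize_apm_info(body: str) -> str:
    """Summarize the JSON returned by the APM server root endpoint."""
    info = json.loads(body)
    version = info.get("version", "unknown")
    # Treat anything other than an explicit true as "not ready".
    if info.get("publish_ready") is True:
        return f"apm-server {version}: ready to publish events"
    return f"apm-server {version}: reachable but not ready to publish events"
```

Keeping the network call out of the function makes it easy to run against a saved response.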

Expected behavior:

Kibana sees the APM server and installs the templates (or installs the templates even if it cannot see the APM server?)

Alternatively, Kibana explains what it tried to do when checking and provides a hint on how the APM server could be made available (for example, if it's a configuration issue, which APM server endpoint it tried and how to change that).

Screenshots (if relevant):

Home-Elastic

Errors in browser console (if relevant):

N/A

Provide logs and/or server output (if relevant):

There are no relevant error logs.

dkarlovi avatar May 03 '24 15:05 dkarlovi

Pinging @elastic/apm-ui (Team:APM)

elasticmachine avatar May 14 '24 22:05 elasticmachine

Alternatively, downgrading to APM version 7.17.21 seems to work as a workaround.

razius avatar May 30 '24 08:05 razius

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

elasticmachine avatar May 31 '24 19:05 elasticmachine

@dkarlovi I noticed the same error in my ECK deployment, but the error disappeared when I redeployed my Kibana with the required xpack for APM.

peowpeowbangbang avatar Jun 29 '24 23:06 peowpeowbangbang

@peowpeowbangbang you mean if you don't install the plugin via the UI but via deployment config?

dkarlovi avatar Jul 01 '24 08:07 dkarlovi

I have the same problem.

Running like this works:

services:
    search:
        image: elasticsearch:8.16.1
        environment:
            - cluster.name=es1
            - discovery.type=single-node
            - xpack.security.enabled=false
            - ES_JAVA_OPTS=-Xms1g -Xmx1g
        ports:
            - 9200:9200
            - 9300:9300
    kibana:
        image: kibana:8.16.1
        environment:
            - ELASTICSEARCH_HOSTS=http://search:9200
            - ELASTIC_APM_SERVER_URL=http://telemetry:8200
        ports:
            - 5601:5601
    telemetry:
        image: elastic/apm-server:7.17.25
        command: apm-server -e -E output.elasticsearch.hosts=http://search:9200
        ports:
            - 8200:8200

(so the workaround works both with elastic/apm-server:7.17.21 and the latest elastic/apm-server:7.17.25), but running with elastic/apm-server:8.15.5 leads to the server not being detected.

DaGeRe avatar Nov 28 '24 13:11 DaGeRe

Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)

elasticmachine avatar May 08 '25 21:05 elasticmachine

@smith should this be a logs UX topic? The logs team hasn't had any touch points with APM server so far.

flash1293 avatar May 11 '25 09:05 flash1293

@flash1293 I put it here because logs team owns the onboarding flows, which I think includes the APM tutorial. You can assign it back to us if that's not the case.

smith avatar May 12 '25 15:05 smith

The Infra & Services UX team is still the code owner for the tutorial / APM onboarding page. Adding an onboarding label too.

gbamparop avatar May 19 '25 16:05 gbamparop