Crashing the app after installing agent - SIGSEGV - core dumped

Open minhquankq opened this issue 4 years ago • 1 comments

Describe the bug: The app crashes with the following error:

child 17 exited on signal 11 (SIGSEGV - core dumped) after 33.225786 seconds from start

To Reproduce: I install apm-agent-php by adding the following step to my Dockerfile:

RUN apk --no-cache add curl-dev gcc g++ make autoconf && \
  mkdir -p /etc/apm-agent && \
  curl -L "https://github.com/elastic/apm-agent-php/archive/refs/tags/v1.2.tar.gz" > /tmp/apm-agent.tar.gz && \
  tar -zxvf /tmp/apm-agent.tar.gz -C /etc/apm-agent --strip-components 1 && \
  cd /etc/apm-agent/src/ext && \
  phpize && \
  CFLAGS="-std=gnu99" ./configure --enable-elastic_apm && \
  make && \
  make install && \
  rm /tmp/apm-agent.tar.gz

Then I enable the extension in the php.ini file:

extension="elastic_apm.so"
elastic_apm.bootstrap_php_part_file=/etc/apm-agent/src/bootstrap_php_part.php
elastic_apm.environment=__elastic_apm_env__
elastic_apm.enabled=__elastic_apm_enabled__
elastic_apm.server_url=__elastic_apm_server_url__
elastic_apm.service_name=__elastic_apm_service_name__
elastic_apm.secret_token=__elastic_apm_secret__
elastic_apm.transaction_sample_rate=0.1
elastic_apm.transaction_max_spans=100
elastic_apm.server_timeout=50ms

After the container starts, the agent sends data to the APM server successfully, but after 5-6 page reloads the server crashes with the error

 child 17 exited on signal 11 (SIGSEGV - core dumped) after 33.225786 seconds from start

and the website can no longer be accessed. After 1-2 minutes the website is reachable again, but after another 4-5 reloads the same crash occurs.

The issue does not happen if elastic_apm.enabled=false (agent installed but not enabled).

Note: I also tried installing the agent from the APK package instead, but got the same issue:

RUN curl -L https://github.com/elastic/apm-agent-php/releases/download/v1.2/apm-agent-php_1.2_all.apk > /tmp/apm-agent-php.apk && \
  apk add --allow-untrusted /tmp/apm-agent-php.apk && \
  rm /tmp/apm-agent-php.apk

Does anyone have experience with this issue?

Expected behavior: No error.

Update: The issue no longer occurs after disabling XDebug.
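
Since disabling XDebug made the crash go away, it can help to confirm which extensions are actually loaded in a given container. The following is a minimal sketch, not part of the original report: the function name `has_both_extensions` is hypothetical, and it assumes the module list comes from `php -m` with the extensions listed as `xdebug` and `elastic_apm` (matched case-insensitively).

```shell
#!/bin/sh
# Hypothetical helper: reads a module list (one name per line, as printed by
# `php -m`) on stdin and succeeds only if both xdebug and elastic_apm appear.
# The extension names are assumptions; adjust them if your build differs.
has_both_extensions() {
  modules=$(cat)
  printf '%s\n' "$modules" | grep -qix 'xdebug' &&
    printf '%s\n' "$modules" | grep -qix 'elastic_apm'
}

# Usage inside the container:
#   php -m | has_both_extensions && echo "XDebug and elastic_apm are both loaded"
```

If the check succeeds, the setup matches the combination under which the crash was reported here.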

minhquankq avatar Sep 20 '21 05:09 minhquankq

Something to note as well: XDebug and apm-agent-php cause a memory leak, see https://github.com/elastic/apm-agent-php/issues/548

MemoryLeak55 avatar Sep 27 '21 12:09 MemoryLeak55

@minhquankq Did you have a chance to try the latest agent release (v1.10.0) - does it still result in the crash?

SergeyKleyman avatar Sep 19 '23 08:09 SergeyKleyman

Closing it for now - please let us know if it still occurs with the latest agent release.

SergeyKleyman avatar Oct 10 '23 08:10 SergeyKleyman

Thank you for the update, @SergeyKleyman. Unfortunately, I am no longer working on this project and cannot verify the changes.

minhquankq avatar Oct 11 '23 05:10 minhquankq

@SergeyKleyman I can confirm the bug still happens in v1.10

I am running the Docker image php:8.2-fpm-alpine3.18 with elastic_apm.log_level_stderr=debug, and the agent logs nothing before the SIGSEGV:

2023-10-16 08:36:01 [16-Oct-2023 06:36:01] NOTICE: [pool www] 'user' directive is ignored when FPM is not running as root
2023-10-16 08:36:01 [16-Oct-2023 06:36:01] NOTICE: [pool www] 'user' directive is ignored when FPM is not running as root
2023-10-16 08:36:01 [16-Oct-2023 06:36:01] NOTICE: [pool www] 'group' directive is ignored when FPM is not running as root
2023-10-16 08:36:01 [16-Oct-2023 06:36:01] NOTICE: [pool www] 'group' directive is ignored when FPM is not running as root
2023-10-16 08:36:01 [16-Oct-2023 06:36:01] NOTICE: fpm is running, pid 1
2023-10-16 08:36:01 [16-Oct-2023 06:36:01] NOTICE: ready to handle connections
2023-10-16 08:36:11 [16-Oct-2023 06:36:11] WARNING: [pool www] child 7 exited on signal 11 (SIGSEGV) after 10.429581 seconds from start
2023-10-16 08:36:11 [16-Oct-2023 06:36:11] NOTICE: [pool www] child 33 started
2023-10-16 08:36:13 [16-Oct-2023 06:36:13] WARNING: [pool www] child 8 exited on signal 11 (SIGSEGV) after 12.290365 seconds from start
2023-10-16 08:36:13 [16-Oct-2023 06:36:13] NOTICE: [pool www] child 34 started
2023-10-16 08:36:14 [16-Oct-2023 06:36:14] WARNING: [pool www] child 9 exited on signal 11 (SIGSEGV) after 13.388372 seconds from start
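
To compare how often workers crash under different settings, the SIGSEGV exits can be counted from the FPM log. This is a rough sketch rather than anything from the agent itself: the function name `count_segfaults` is hypothetical, and the pattern simply matches the "exited on signal 11" wording seen in the log above.

```shell
#!/bin/sh
# Hypothetical helper: counts FPM worker exits caused by signal 11 in a log
# read from stdin. grep -c exits non-zero when the count is 0, so `|| true`
# keeps the helper's exit status clean while still printing the count.
count_segfaults() {
  grep -c 'exited on signal 11' || true
}

# Usage (replace <container> with your container name or id):
#   docker logs <container> 2>&1 | count_segfaults
```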

In my case it only happens when using a preload script (from Symfony) in combination with FPM:

opcache.preload='/app/config/proload.php'

It works fine in CLI.
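
To check whether a given setup is affected by the same combination, one option is to look for an active opcache.preload directive in the ini files that FPM loads. A minimal sketch under stated assumptions: the function name `preload_value` is hypothetical, and the usage line assumes the ini files live under `$PHP_INI_DIR/conf.d`, as in the official PHP Docker images.

```shell
#!/bin/sh
# Hypothetical helper: reads php.ini-style text on stdin and prints the value
# of the first opcache.preload directive that is set and not commented out.
# Surrounding quotes are stripped from the printed value.
preload_value() {
  sed -n "s/^[[:space:]]*opcache\.preload[[:space:]]*=[[:space:]]*//p" |
    tr -d "'\"" | head -n 1
}

# Usage inside the container (path is an assumption about the image layout):
#   cat "$PHP_INI_DIR"/conf.d/*.ini | preload_value
```

An empty result means no preload script is configured in the files that were read, so the crash described in this comment would have to come from something else.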

SmasherHell avatar Oct 16 '23 07:10 SmasherHell