Segmentation fault. Graphql-engine docker container fails at startup

Open Ahsalis opened this issue 1 year ago • 2 comments

Version Information

Server Version: v2.38.0

Environment

AWS cloud; Instance OS: AlmaLinux release 9.4 (Seafoam Ocelot)

What is the current behaviour?

The graphql-engine container fails at startup with a segmentation fault.

What is the expected behaviour?

The container should be healthy.

How to reproduce the issue?

  1. docker-compose.yml file. Copied from here.
version: "3.6"
services:
  postgres:
    image: postgres:15
    restart: always
    volumes:
      - db_data:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: postgrespassword
  graphql-engine:
    image: hasura/graphql-engine:v2.38.0
    ports:
      - "8080:8080"
    restart: always
    environment:
      ## postgres database to store Hasura metadata
      HASURA_GRAPHQL_METADATA_DATABASE_URL: postgres://postgres:postgrespassword@postgres:5432/postgres
      ## this env var can be used to add the above postgres database to Hasura as a data source. this can be removed/updated based on your needs
      PG_DATABASE_URL: postgres://postgres:postgrespassword@postgres:5432/postgres
      ## enable the console served by server
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true" # set to "false" to disable console
      ## enable debugging mode. It is recommended to disable this in production
      HASURA_GRAPHQL_DEV_MODE: "true"
      HASURA_GRAPHQL_ENABLED_LOG_TYPES: startup, http-log, webhook-log, websocket-log, query-log
      ## uncomment next line to run console offline (i.e load console assets from server instead of CDN)
      # HASURA_GRAPHQL_CONSOLE_ASSETS_DIR: /srv/console-assets
      ## uncomment next line to set an admin secret
      # HASURA_GRAPHQL_ADMIN_SECRET: myadminsecretkey
      HASURA_GRAPHQL_METADATA_DEFAULTS: '{"backend_configs":{"dataconnector":{"athena":{"uri":"http://data-connector-agent:8081/api/v1/athena"},"mariadb":{"uri":"http://data-connector-agent:8081/api/v1/mariadb"},"mysql8":{"uri":"http://data-connector-agent:8081/api/v1/mysql"},"oracle":{"uri":"http://data-connector-agent:8081/api/v1/oracle"},"snowflake":{"uri":"http://data-connector-agent:8081/api/v1/snowflake"}}}}'
    depends_on:
      data-connector-agent:
        condition: service_healthy
  data-connector-agent:
    image: hasura/graphql-data-connector:v2.38.0
    restart: always
    ports:
      - 8081:8081
    environment:
      QUARKUS_LOG_LEVEL: ERROR # FATAL, ERROR, WARN, INFO, DEBUG, TRACE
      ## https://quarkus.io/guides/opentelemetry#configuration-reference
      QUARKUS_OPENTELEMETRY_ENABLED: "false"
      ## QUARKUS_OPENTELEMETRY_TRACER_EXPORTER_OTLP_ENDPOINT: http://jaeger:4317
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8081/api/v1/athena/health"]
      interval: 5s
      timeout: 10s
      retries: 5
      start_period: 5s
volumes:
  db_data:
  2. Run docker compose up
  3. The graphql-engine container fails to start.

Screenshots or Screencast

image

Any possible solutions/workarounds you're aware of?

I tried running the same compose file on an Ubuntu machine and it works fine. The issue seems to occur only on AlmaLinux (I tried it on a VM in VMware and on a cloud instance).

Keywords

segmentation fault

Ahsalis • Jun 01 '24 18:06

That's very strange. My guess is that there's something incompatible with either your kernel version (RHEL, and therefore AlmaLinux, is on a pretty old version) or your Docker Engine.

Have you looked in the system logs? Might be worth running dmesg immediately after you see a crash.
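If it helps, here is a rough sketch of pulling the relevant lines out of the kernel log right after a crash. It assumes `dmesg` is on the PATH and that you have permission to read the kernel ring buffer (often root only). The helper names `read_kernel_log` and `segfault_lines` are made up for this example.

```python
import subprocess


def read_kernel_log() -> str:
    """Return the kernel ring buffer as text by running `dmesg`.

    Assumption: `dmesg` is installed and the caller may read the log
    (on many systems this requires root).
    """
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout


def segfault_lines(kernel_log: str, process: str = "graphql-engine") -> list[str]:
    """Return the log lines that report a segfault mentioning `process`."""
    return [
        line
        for line in kernel_log.splitlines()
        if "segfault at" in line and process in line
    ]


# Typical use right after the container crashes:
#   for line in segfault_lines(read_kernel_log()):
#       print(line)
```

The filter is deliberately loose (plain substring matching), so it is worth reading the few lines around each hit in the full log as well.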

SamirTalwar • Jun 03 '24 12:06

I'm getting this with graphql-engine v2.31.0-ce on Linux ip-172-31-70-77.ec2.internal 5.14.0-503.15.1.el9_5.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Nov 14 15:45:31 EST 2024 x86_64 x86_64 x86_64 GNU/Linux -- it's an EC2 Xeon 8375C VM.

Yes, dmesg does confess a little:

[427164.071039] graphql-engine[1626180]: segfault at 7f0e610ad257 ip 00007f0e610ad257 sp 00007fff99e1f820 error 15 in graphql-engine[7f0e5f8a5000+1809000] likely on CPU 3 (core 3, socket 0)
[427164.071703] Code: ff d5 59 5e 5f 5d 6a 05 5a 6a 0a 58 0f 05 41 ff e5 5d e8 3c ff ff ff 2f 70 72 6f 63 2f 73 65 6c 66 2f 65 78 65 00 00 01 00 00 <e8> 4a 00 00 00 83 f9 49 75 44 53 57 48 8d 4c 37 fd 5e 56 5b eb 2f
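To make those two lines easier to read, here is a small sketch that decodes them. It assumes the usual x86-64 Linux format, in which the `error` field is the page-fault error code printed in hexadecimal and the bracketed pair after the binary name is the mapping base and size. The function names and bit labels are my own and are only illustrative.

```python
import re

# The two kernel log lines quoted above, without the timestamp prefix.
SEGFAULT_LINE = (
    "graphql-engine[1626180]: segfault at 7f0e610ad257 ip 00007f0e610ad257 "
    "sp 00007fff99e1f820 error 15 in graphql-engine[7f0e5f8a5000+1809000]"
)
CODE_LINE = (
    "Code: ff d5 59 5e 5f 5d 6a 05 5a 6a 0a 58 0f 05 41 ff e5 5d e8 3c ff ff ff "
    "2f 70 72 6f 63 2f 73 65 6c 66 2f 65 78 65 00 00 01 00 00 <e8> 4a 00 00 00 "
    "83 f9 49 75 44 53 57 48 8d 4c 37 fd 5e 56 5b eb 2f"
)

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

# x86 page-fault error-code bits: (mask, meaning if clear, meaning if set).
PF_BITS = [
    (0x01, "page not present", "protection violation"),
    (0x02, "not a write", "write"),
    (0x04, "kernel mode", "user mode"),
    (0x08, None, "reserved page-table bit set"),
    (0x10, None, "instruction fetch"),
]


def decode_segfault(line: str) -> dict:
    """Parse a kernel 'segfault at ...' line into numbers and flag names."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)  # assumption: the kernel prints this field in hex
    flags = []
    for mask, clear_name, set_name in PF_BITS:
        name = set_name if err & mask else clear_name
        if name:
            flags.append(name)
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),
        "ip": ip,
        "ip_offset": ip - base,  # offset of the faulting instruction in the mapping
        "mapping_size": int(m["size"], 16),
        "error": err,
        "flags": flags,
    }


def ascii_runs(code_line: str, min_len: int = 8) -> list[str]:
    """Return printable-ASCII runs of at least min_len bytes from a 'Code:' line."""
    tokens = code_line.split(":", 1)[1].split()
    data = bytes(int(tok.strip("<>"), 16) for tok in tokens)
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [run.decode("ascii") for run in pattern.findall(data)]


info = decode_segfault(SEGFAULT_LINE)
print(hex(info["ip_offset"]), info["flags"])
print(ascii_runs(CODE_LINE))
```

For the lines above, the instruction pointer sits at offset 0x1808257 inside the graphql-engine mapping (size 0x1809000), and the fault address equals the instruction pointer. Read as hex, error 15 decodes to a user-mode instruction fetch that hit a protection violation, which usually points at an attempt to execute from a page that is not marked executable. The Code bytes also contain the ASCII string /proc/self/exe. None of this pins down the root cause by itself.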

mnp • Jan 22 '25 18:01