[bitnami/apisix] Container can't be restarted correctly after crash

Open pio2398 opened this issue 8 months ago • 1 comments

Name and Version

bitnami/apisix:5.0.2

What architecture are you using?

amd64

What steps will reproduce the bug?

This is a follow-up to issue #29789. Currently, after a crash, the only working solution is manual container recreation.

Perhaps the chart implementation should remove the socket file on startup? I attempted the following configuration:

      lifecycle:
        postStart:
          exec:
            command:
            - /bin/sh
            - -c
            - |
              sleep 5;
              rm /usr/local/apisix/logs/worker_events.sock

However, this approach didn't work for me. If this cleanup is necessary, maybe it should be incorporated directly into the chart?
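
For reference, the cleanup step being asked for could look roughly like the sketch below. It is only an illustration and is not taken from the chart. The function name `cleanup_stale_socket` is hypothetical, and the sketch assumes the step runs before nginx tries to bind the socket (for example from a wrapper around the container's start command). A `postStart` hook does not guarantee that ordering, since Kubernetes may run it in parallel with the container's entrypoint. The function deletes the path only if it is actually a Unix socket.

```shell
#!/bin/sh
# Hypothetical helper (not part of the bitnami/apisix chart):
# remove a leftover Unix socket file so a later bind() can succeed.
cleanup_stale_socket() {
  sock_path="$1"
  # -S is true only when the path exists and is a socket,
  # so regular files and directories are never touched.
  if [ -S "$sock_path" ]; then
    rm -f -- "$sock_path"
    echo "removed stale socket: $sock_path"
  else
    echo "no stale socket at: $sock_path"
  fi
}
```

It would be called as `cleanup_stale_socket /usr/local/apisix/logs/worker_events.sock`, using the path from the error log in this issue.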

Current error after container restart:

2025/06/13 03:57:59 [emerg] 1#1: bind() to unix:/usr/local/apisix/logs/worker_events.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/usr/local/apisix/logs/worker_events.sock failed (98: Address already in use)
2025/06/13 03:57:59 [emerg] 1#1: bind() to unix:/usr/local/apisix/logs/worker_events.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/usr/local/apisix/logs/worker_events.sock failed (98: Address already in use)
2025/06/13 03:57:59 [emerg] 1#1: bind() to unix:/usr/local/apisix/logs/worker_events.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/usr/local/apisix/logs/worker_events.sock failed (98: Address already in use)
2025/06/13 03:57:59 [emerg] 1#1: bind() to unix:/usr/local/apisix/logs/worker_events.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/usr/local/apisix/logs/worker_events.sock failed (98: Address already in use)
2025/06/13 03:57:59 [emerg] 1#1: bind() to unix:/usr/local/apisix/logs/worker_events.sock failed (98: Address already in use)
nginx: [emerg] bind() to unix:/usr/local/apisix/logs/worker_events.sock failed (98: Address already in use)
2025/06/13 03:57:59 [emerg] 1#1: still could not bind()
nginx: [emerg] still could not bind()

The crash itself is probably an upstream problem, but the missing cleanup when the container restarts may be a problem with the chart.

Are you using any custom parameters or values?

project: default
source:
  repoURL: registry-1.docker.io/bitnamicharts
  targetRevision: 5.0.2
  helm:
    parameters:
      - name: dataPlane.ingress.enabled
        value: 'true'
      - name: etcd.replicaCount
        value: '1'
    valuesObject:
      controlPlane:
        enabled: true
        lifecycleHooks:
          postStart:
            exec:
              command:
              - /bin/sh
              - -c
              - |
                sleep 5;
                rm /usr/local/apisix/logs/worker_events.sock
      dataPlane:
        config:
          apisix:
            enable_ipv6: true
        service:
          externalIPs:
            - 192.168.9.252
            - 3.3.3.3
            - 1.1.1.1
  chart: apisix
destination:
  server: https://kubernetes.default.svc
  namespace: ingress-apisix

What do you see instead?

The container should start normally after a crash. Instead, it fails to start with the bind() errors shown above.

pio2398 avatar Jun 13 '25 04:06 pio2398

Thank you for bringing this issue to our attention. We appreciate your involvement! If you're interested in contributing a solution, we welcome you to create a pull request. The Bitnami team is excited to review your submission and offer feedback. You can find the contributing guidelines here.

Your contribution will greatly benefit the community. Feel free to reach out if you have any questions or need assistance.

carrodher avatar Jun 16 '25 07:06 carrodher

Lifecycle hooks didn't work for me either, but the solution proposed by @bradib0y in this comment did.

I have expanded it a bit (to apply it to both the control-plane and the data-plane containers) and have posted it in this comment.

spantaleev avatar Jun 23 '25 10:06 spantaleev

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] avatar Jul 09 '25 01:07 github-actions[bot]

Now I am using this config:

project: default
source:
  repoURL: registry-1.docker.io/bitnamicharts
  targetRevision: 5.0.2
  helm:
    parameters:
      - name: dataPlane.ingress.enabled
        value: 'true'
      - name: etcd.replicaCount
        value: '1'
    valuesObject:
      controlPlane:
        enabled: true
        lifecycleHooks:
          postStart:
            exec:
              command:
                - /bin/sh
                - '-c'
                - |
                  sleep 5;
                  rm /usr/local/apisix/logs/worker_events.sock
      dataPlane:
        config:
          apisix:
            enable_ipv6: true
        lifecycleHooks:
          postStart:
            exec:
              command:
                - /bin/sh
                - '-c'
                - |
                  sleep 5;
                  rm /usr/local/apisix/logs/worker_events.sock
        service:
          externalIPs:
            - 1.2.3.4
          ipFamilies:
            - IPv6
            - IPv4
          ipFamilyPolicy: PreferDualStack
  chart: apisix
destination:
  server: https://kubernetes.default.svc
  namespace: ingress-apisix

and it seems to be working.

I don't know which solution is better: mine or the one provided by @spantaleev.

pio2398 avatar Jul 09 '25 05:07 pio2398

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] avatar Jul 26 '25 01:07 github-actions[bot]

Still valid

pio2398 avatar Jul 26 '25 04:07 pio2398