"Random" 502 errors

Open ValentinAUCLERC opened this issue 2 years ago • 30 comments

Checklist

  • Have you pulled and found the error with jc21/nginx-proxy-manager:latest docker image?
    • Yes
  • Are you sure you're not using someone else's docker image?
    • Yes
  • Have you searched for similar issues (both open and closed)?
    • Yes

Describe the bug

I'm using NPM with multiple images (portainer, httpd, ...), and I made the NPM docker-compose project join the network of each of my other docker-compose projects.

When I access, say, my httpd project directly through its exposed port, there is no problem, even if I mash the F5 button. When I do the same through the NPM proxy, I get "random" 502 errors roughly 5-10% of the time, on the same URL.

Meanwhile, direct access never shows the problem.

Nginx Proxy Manager Version

v2.9.19

To Reproduce

  • Create a new network (with IPv6 enabled)
  • Create a docker-compose project that joins this network
  • Add the same network to the NPM docker-compose file (in addition to its other networks)
  • Create a proxy host using the hostname
  • Mash the F5 button
  • Sometimes (most of the time) it works, sometimes it doesn't
  • Nothing appears in the NPM container's docker logs

Operating System

Docker on Ubuntu Server

ValentinAUCLERC avatar Feb 12 '23 14:02 ValentinAUCLERC

Here is my docker-compose.yml for NPM

version: '2.4'
services:
  app:
    container_name: nginxproxymanager
    image: 'jc21/nginx-proxy-manager:latest'
    restart: unless-stopped
    environment:
      ENABLE_IPV6: true
    ports:
      - '80:80'
      - '85:81'
      - '443:443'
    volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt
    networks:
      - p1
      - p2
      - p3
networks:
  p1:
    external: true
  p2:
    external: true
  p3:
    external: true

ValentinAUCLERC avatar Feb 12 '23 14:02 ValentinAUCLERC

More info:

I just looked at my data/logs/proxy-host-7_error.log and found this message:

2023/02/12 21:16:09 [error] 782#782: *176817 [myhost] could not be resolved (3: Host not found)

But on the next F5, it finds the host.

ValentinAUCLERC avatar Feb 12 '23 21:02 ValentinAUCLERC

Hello, I also have the random 502 errors. I've had them on all proxies since the last update. All I can do is press F5 a few times or wait. That's pretty annoying :-(

Before the update everything was working fine!!

Errlog:

2023/02/12 21:52:37 [error] 681#681: *1252331 upstream prematurely closed connection while reading response header from upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:37 [error] 682#682: *1252334 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:37 [error] 682#682: *1252333 upstream prematurely closed connection while reading response header from upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:37 [error] 681#681: *1252338 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:37 [error] 682#682: *1252332 upstream prematurely closed connection while reading response header from upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:37 [error] 682#682: *1252335 upstream prematurely closed connection while reading response header from upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:37 [error] 682#682: *1252336 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:37 [error] 681#681: *1252345 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:37 [error] 681#681: *1252347 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:38 [error] 681#681: *1252349 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:38 [error] 681#681: *1252351 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:38 [error] 681#681: *1252353 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:38 [error] 681#681: *1252355 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:39 [error] 681#681: *1252357 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:39 [error] 681#681: *1252359 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:39 [error] 682#682: *1252367 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"
2023/02/12 21:52:39 [error] 681#681: *1252361 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 172.21.0.1, server: symcon.domain.de, request: "POST /hook/ipsviewconnect/api/ HTTP/1.1", upstream: "https://192.168.26.5:3778/hook/ipsviewconnect/api/", host: "symcon.domain.de"

LukeSkywalker993 avatar Feb 12 '23 21:02 LukeSkywalker993

I had this problem too (along with mixed-up hostname resolution: when I tried to access host1, it showed host2), but disabling and re-enabling the host was a workaround for me.

ValentinAUCLERC avatar Feb 13 '23 09:02 ValentinAUCLERC

Having the same issue. NPM worked fine when I was using docker, but ever since I switched to podman for better rootless containers, I keep getting random 502s that go away after a refresh or two. Unlike the others that have this issue, I'm not getting any error logs

frap129 avatar Mar 23 '23 20:03 frap129

Hi! I have the same problem on a fresh install using this compose file:

version: '3.7'

networks:
  nginx-proxy-manager:
    external: true

services:
  npm:
    image: 'jc21/nginx-proxy-manager:latest'
    container_name: nginx-proxy-manager
    restart: unless-stopped
    ports:
      - '80:80'
      - '43013:81'
      - '443:443'
    networks:
      - nginx-proxy-manager
    depends_on:
      - npm-db
    environment:
      DB_MYSQL_HOST: npm-db
      DB_MYSQL_PORT: 3306
      DB_MYSQL_USER: npm
      DB_MYSQL_PASSWORD: npm
      DB_MYSQL_NAME: npm
      # Uncomment this if IPv6 is not enabled on your host
      #DISABLE_IPV6: 'true'
    volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt

  npm-db:
    image: 'jc21/mariadb-aria:latest'
    restart: unless-stopped
    environment:
      MYSQL_ROOT_PASSWORD: npm
      MYSQL_DATABASE: npm
      MYSQL_USER: npm
      MYSQL_PASSWORD: npm
    networks:
      - nginx-proxy-manager
    volumes:
      - ./data/mysql:/var/lib/mysql

Any ideas?

mariadb logs:

MySQL init process done. Ready for start up.

exec /usr/bin/mysqld --user=mysql --console --skip-name-resolve --skip-networking=0
2023-03-29 17:25:33 0 [Note] /usr/bin/mysqld (mysqld 10.4.15-MariaDB) starting as process 1 ...
2023-03-29 17:25:33 0 [ERROR] mysqld: File '/var/lib/mysql/aria_log_control' not found (Errcode: 13 "Permission denied")
2023-03-29 17:25:33 0 [ERROR] mysqld: Got error 'Can't open file' when trying to use aria control file '/var/lib/mysql/aria_log_control'
2023-03-29 17:25:33 0 [ERROR] Plugin 'Aria' init function returned error.
2023-03-29 17:25:33 0 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
2023-03-29 17:25:33 0 [Note] Plugin 'InnoDB' is disabled.
2023-03-29 17:25:33 0 [Note] Plugin 'FEEDBACK' is disabled.
2023-03-29 17:25:33 0 [ERROR] Could not open mysql.plugin table. Some plugins may be not loaded
2023-03-29 17:25:33 0 [ERROR] Failed to initialize plugins.
2023-03-29 17:25:33 0 [ERROR] Aborting
[i] pre-init.d - processing /scripts/pre-init.d/01_secret-init.sh
[i] mysqld already present, skipping creation
[i] MySQL directory already present, skipping creation
2023-03-29 17:25:34 0 [Note] /usr/bin/mysqld (mysqld 10.4.15-MariaDB) starting as process 1 ...
2023-03-29 17:25:34 0 [Note] Plugin 'InnoDB' is disabled.
2023-03-29 17:25:34 0 [Note] Plugin 'FEEDBACK' is disabled.
2023-03-29 17:25:34 0 [Note] Server socket created on IP: '::'.
2023-03-29 17:25:34 0 [Warning] 'user' entry '@3726bb8bb89f' ignored in --skip-name-resolve mode.
2023-03-29 17:25:34 0 [Warning] 'proxies_priv' entry '@% root@3726bb8bb89f' ignored in --skip-name-resolve mode.
2023-03-29 17:25:34 0 [Note] Reading of all Master_info entries succeeded
2023-03-29 17:25:34 0 [Note] Added new Master_info '' to hash table
2023-03-29 17:25:34 0 [Note] /usr/bin/mysqld: ready for connections.
Version: '10.4.15-MariaDB'  socket: '/run/mysqld/mysqld.sock'  port: 3306  MariaDB Server
2023-03-29 17:25:39 3 [Warning] Aborted connection 3 to db: 'unconnected' user: 'unauthenticated' host: '172.19.0.3' (This connection closed normally without authentication)
2023-03-29 17:25:40 4 [Warning] Aborted connection 4 to db: 'unconnected' user: 'unauthenticated' host: '172.19.0.3' (This connection closed normally without authentication)
2023-03-29 17:25:41 5 [Warning] Aborted connection 5 to db: 'unconnected' user: 'unauthenticated' host: '172.19.0.3' (This connection closed normally without authentication)
2023-03-29 17:25:42 6 [Warning] Aborted connection 6 to db: 'unconnected' user: 'unauthenticated' host: '172.19.0.3' (This connection closed normally without authentication)
2023-03-29 17:25:43 7 [Warning] Aborted connection 7 to db: 'unconnected' user: 'unauthenticated' host: '172.19.0.3' (This connection closed normally without authentication)

NPM logs:

❯ Starting nginx ...
❯ Starting backend ...
s6-rc: info: service frontend successfully started
s6-rc: info: service nginx successfully started
s6-rc: info: service backend successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
[3/29/2023] [5:25:33 PM] [Global   ] › ℹ  info      Using MySQL configuration
[3/29/2023] [5:25:33 PM] [Global   ] › ℹ  info      Creating a new JWT key pair...
[3/29/2023] [5:25:38 PM] [Global   ] › ℹ  info      Wrote JWT key pair to config file: /data/keys.json
[3/29/2023] [5:25:39 PM] [Global   ] › ✖  error     Packets out of order. Got: 1 Expected: 0
[3/29/2023] [5:25:40 PM] [Global   ] › ✖  error     Packets out of order. Got: 1 Expected: 0
[3/29/2023] [5:25:41 PM] [Global   ] › ✖  error     Packets out of order. Got: 1 Expected: 0
[3/29/2023] [5:25:42 PM] [Global   ] › ✖  error     Packets out of order. Got: 1 Expected: 0
[3/29/2023] [5:25:43 PM] [Global   ] › ✖  error     Packets out of order. Got: 1 Expected: 0
[3/29/2023] [5:25:44 PM] [Global   ] › ✖  error     Packets out of order. Got: 1 Expected: 0
[3/29/2023] [5:25:45 PM] [Global   ] › ✖  error     Packets out of order. Got: 1 Expected: 0
[3/29/2023] [5:25:46 PM] [Global   ] › ✖  error     Packets out of order. Got: 1 Expected: 0

bvn13 avatar Mar 29 '23 18:03 bvn13

Same problem, using version 2.9.22 with Podman. Disabling and re-enabling did not work.

Javierkaiser avatar Mar 30 '23 11:03 Javierkaiser

For a fresh installation, prefer version 2.9.22. It works.

bvn13 avatar Apr 01 '23 18:04 bvn13

Just to add to the discussion, I also experienced this on the latest version but managed to work around it by creating a shared bridge network on Podman (so I can set IP ranges and such), attaching the containers to it, and then setting static IPs on the containers. With that I can point nginx-proxy-manager to the static IP of the container instead of the hostname, and it seems to work reliably.

edit: FWIW I'm using podman 4.2.0 on Rocky Linux 9.1 and configuring everything with ansible so I'm not sure how applicable this is to everyone else.

maowohl avatar Apr 12 '23 17:04 maowohl

Interesting, I'm also using podman. Are you running rootless? I'm wondering if it's some weirdness related to rootless networking

frap129 avatar Apr 12 '23 17:04 frap129

This doesn't seem to be an issue with rootless networking. I'm also using Podman, but as root. NPM works perfectly when using internal IP addresses instead of hostnames. The occasional 502 errors occur only for containers addressed by hostname.

ootrey avatar Apr 14 '23 13:04 ootrey

I'm also experiencing this issue.

  • docker.io/jc21/nginx-proxy-manager:2.9.22
  • podman version 4.4.4
  • OpenSUSE Leap 15.4
  • kernel 5.14.21-150400.24.60-default
  • cni-plugin-dnsname-1.3.1 RPM installed to enable podman DNS functionality

Created a bridge network with default settings and attached containers for nginx-proxy-manager, grafana and prometheus.

I'd say approx 10-20% of requests fail with a 502, and inspecting the nginx-proxy-manager proxy host error logs shows:

2023/04/24 14:10:46 [error] 713#713: *324 grafana.dns.podman could not be resolved (3: Host not found), client: 172.26.0.131, server: monitoring.redacted.com, request: "GET /api/live/ws HTTP/1.1", host: "monitoring.redacted.com"

However from within the nginx-proxy-manager container itself, nslookup can resolve it every time

$ nslookup grafana.dns.podman
Server:		10.89.0.1
Address:	10.89.0.1:53

Name:	grafana.dns.podman
Address: 10.89.0.2

and this call to grafana's health endpoint succeeds 100% of the time:

$ wget -O - grafana.dns.podman:3000/api/health
Connecting to grafana.dns.podman:3000 (10.89.0.2:3000)
writing to stdout
{
  "commit": "4add91f03d",
  "database": "ok",
  "version": "9.4.7"
-                    100% |***********************************************************************************************************************************************************************************************|    70  0:00:00 ETA
written to stdout

I have a second proxy host configured to a prometheus container, and this also suffers the same problem. I've also tried running all containers as privileged but the same issue occurs. An obvious workaround is to use static container IPs and set these as the forward ip instead of the hostname but it's not ideal.
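
A check like nslookup typically asks only the first nameserver it is configured with, so it can succeed every time even if another configured nameserver does not know the container name. The sketch below is one way to ask each nameserver separately, using only the Python standard library. It is an illustration, not part of NPM: the function names are made up for this example, it handles IPv4 nameservers and A records only, and it assumes the `3` in nginx's "(3: Host not found)" corresponds to DNS response code 3 (NXDOMAIN).

```python
import socket
import struct

# Names for the DNS response codes most likely to show up here.
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for an A record."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def rcode_of(response: bytes) -> str:
    """Return the response code name from a raw DNS reply."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE {code}")


def ask(server: str, name: str, timeout: float = 2.0) -> str:
    """Send one A query to one nameserver over UDP and report the outcome."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError as exc:  # timeouts are a subclass of OSError
            return f"error: {exc}"
    return rcode_of(reply)
```

Calling `ask("10.89.0.1", "grafana.dns.podman")` and then the same for every other `nameserver` entry in the container's /etc/resolv.conf would show whether one of them answers NXDOMAIN for the container name.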

calcium90 avatar Apr 24 '23 14:04 calcium90

Same problem here with a fresh 2.10.2 Docker install. The problem goes away as soon as I add the root domain to one of the proxy hosts; as soon as I remove the root domain and leave only subdomains, I get the 502 error.

So basically I cannot use NPM with only subdomains

cpuks avatar Apr 26 '23 19:04 cpuks

I'm also experiencing the issues described when using Podman (on plain Docker it worked fine).

Changing the reverse proxy to Caddy instead of Nginx Proxy Manager completely solved the issues, so my guess is that it is due to the DNS resolver NPM is using.

jbmorgado avatar May 19 '23 10:05 jbmorgado

Just to add to the discussion, I also experienced this on the latest version but managed to work around it by creating a shared bridge network on podman(so I can set ip ranges and stuff), attaching the containers to it and then setting static IPs on the containers. With that I can point nginx-proxy-manager to the static ip of the container instead of the hostname and it seems to work reliably. edit: FWIW I'm using podman 4.2.0 on Rocky Linux 9.1 and configuring everything with ansible so I'm not sure how applicable this is to everyone else.

Interesting, I'm also using podman. Are you running rootless? I'm wondering if it's some weirdness related to rootless networking

This doesn't seem to be an issue with rootless networking. I'm also using Podman, but as root. NPM works perfectly when using internal IP addresses instead of hostnames. The occasional 502 errors occur only for containers addressed by hostname.

I am also experiencing issues when using hostnames. I just switched over to Podman, specifying subuid and subgid and running as root. After switching to IPs, the problem seems to have gone away.

noelmiller avatar Jun 04 '23 06:06 noelmiller

Since I pulled the latest version I've begun to get the random 502s. I don't know exactly why, but the issue is with domain name resolution.

pablodgonzalez avatar Jul 12 '23 00:07 pablodgonzalez

Pretty sure this issue and the hosts issue (#2197) are related. I've added an interim solution to the other ticket (Podman specific). I've not experienced this under Docker myself.

fuzzyfox avatar Jul 19 '23 14:07 fuzzyfox

I have Portainer Business Edition 2.19.4 and jc21/nginx-proxy-manager:2.10.4. Same issue.

UPD: the method provided by @fuzzyfox actually seems to work.

MrMasrozYTLIVE avatar Dec 24 '23 16:12 MrMasrozYTLIVE

Pretty sure this issue and the hosts issue (#2197) are related. I've added an interim solution to the other ticket (Podman specific). I've not experienced this under Docker myself.

I just tried a version of this solution with Docker. Starting from a default Docker configuration (i.e. no /etc/docker/daemon.json file), at some version upgrade in the past few months I started getting a lot of 502s caused by "host not found" errors. Thinking that Docker and resolved had hosed something, I created the daemon.json file with 2 DNS entries, but hostnames still failed to resolve very often (half the time?). Taking a hint from #2197, I changed the file to contain only my local router's DNS, and that fixed the problem.
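
For readers who want to try the same change, here is a minimal sketch of what such a daemon.json could contain, written as a small Python helper so that existing keys in the file are preserved. The `dns` key is a standard Docker daemon option; the helper name and the address 192.168.1.1 are placeholders for this example, not values from the comment above.

```python
import json


def with_single_dns(config: dict, dns_server: str) -> dict:
    """Return a copy of a daemon.json config that lists exactly one DNS server."""
    updated = dict(config)
    # Replace any existing list, so containers are handed a single nameserver.
    updated["dns"] = [dns_server]
    return updated


# 192.168.1.1 stands in for the local router's DNS address (assumption).
print(json.dumps(with_single_dns({}, "192.168.1.1"), indent=2))
```

The printed JSON would go into /etc/docker/daemon.json, and the Docker daemon generally has to be restarted before containers pick up the new setting. Note that this affects every container on the host, not only NPM.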

RobertoMaurizzi avatar Feb 11 '24 08:02 RobertoMaurizzi

I encounter the random 502 errors also.

No Podman, just plain docker. However it seems they only show up when setting: network_mode: host in my docker-compose.yml

Using the 'latest' image

EDIT: Forgot to mention that the 502 is what I see in the different browsers (Firefox, Chrome, ..., both in private mode and not). In the logs I find 404 host can not be resolved. My first idea was that there was some 'DNS domain clash' (if that even exists), as publicly served hosts such as blah.domain.com resolve internally to a local IP (192.168.xxx.xxx range). So I redeployed the internal DNS so that the hosts resolve to another domain to rule that out, but with no success.

DragonPi avatar Jul 01 '24 14:07 DragonPi

Same issue here. Until further news, I will switch to pointing to IPs directly and avoid going through my DNS.

(BTW: it's AdGuard; not sure if anyone is experiencing the same issues with Pi-hole? I doubt it's related to the DNS server, it seems to be more on NPM's side, but just in case.)

CarlesLlobet avatar Nov 03 '24 09:11 CarlesLlobet

Long time ago. Update fixed all my problems

LukeSkywalker993 avatar Nov 03 '24 10:11 LukeSkywalker993

@LukeSkywalker993 Update to NPM? I'm running the latest version (2.12.1) and experiencing this issue when using hostnames instead of IPs.

CarlesLlobet avatar Nov 03 '24 17:11 CarlesLlobet

@CarlesLlobet did you try defining a single DNS in your /etc/docker/daemon.json file as I did above?

RobertoMaurizzi avatar Nov 05 '24 14:11 RobertoMaurizzi

@RobertoMaurizzi I don't use Docker; I run NPM directly on an LXC. More details here.

However, I don't think your approach is ideal, since it relies on a single, manually hardcoded DNS server as a single point of failure. If that DNS server fails or goes down, your whole NPM doesn't resolve anything. If I use DNS, I want to be able to use the DNS server handed out by DHCP and, in case my DNS goes down for some reason, have a second fallback DNS in place.

Otherwise I prefer to just enter the IPs manually (although it's far from ideal, since I have to keep them updated with any changes, which means updating 2 places: DNS + NPM). At least this way, if my DNS goes down, NPM and all my services behind it still work.

CarlesLlobet avatar Nov 05 '24 14:11 CarlesLlobet

@CarlesLlobet the main point here is that the problem isn't in NPM but in the system that manages the DNS for the containers, at least for Podman and Docker. They both use containerd, which in turn uses lxc, so there's likely a problem there. Your options are to set up multiple DNS servers on a shared IP, use a VM, etc. In general: find a way to run it that doesn't involve a broken DNS for NPM.

RobertoMaurizzi avatar Nov 05 '24 14:11 RobertoMaurizzi

the problem isn't in NPM but in the system that manages the DNS for the containers

@RobertoMaurizzi I agree the issue might lie on the Docker side rather than in NPM.

However, at the same time, the only official install option that NPM supports/suggests is Docker. So I do think it is fair that people might want to set up their hosts with hostnames instead of IPs, without having to apply patches to the Docker DNS config. Furthermore, the solution you suggest would affect ALL your Docker containers, not only NPM.

I am not sure what the best solution / course of action would be from NPM's side: either offering an alternative installation method that doesn't have this issue/bug with DNS, or internally doing the request twice if the first DNS resolution fails. But I do think it could be improved.

From what I've read from others and from my own use case, when NPM replied with a 502, refreshing the page a second time would always reach the correct destination. So maybe just "trying twice" in case the first DNS resolution fails would solve the issue, and it shouldn't add too much overhead / lag to the response.

But I'm not an expert on the NPM implementation. Do you think that'd be feasible? 🤔
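
To make the "trying twice" idea concrete, here is a minimal sketch of retry-on-failure name resolution. It only illustrates the logic; it is not how NPM or nginx is implemented, and `resolve_with_retry` and its `resolve` callback are hypothetical names introduced for this example.

```python
from typing import Callable


def resolve_with_retry(
    resolve: Callable[[str], str], name: str, attempts: int = 2
) -> str:
    """Call resolve(name), retrying if the lookup fails.

    `resolve` is any function that returns an address or raises LookupError.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return resolve(name)
        except LookupError as exc:
            # Remember the failure and try again, e.g. against the next server.
            last_error = exc
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

A retry only helps if the second attempt can reach a nameserver that knows the name; if every attempt hits a server that does not, the failure is merely delayed.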

CarlesLlobet avatar Nov 05 '24 15:11 CarlesLlobet

@CarlesLlobet the problem is that if you have 3 DNS entries, it becomes 1 good lookup in every three (that was my initial problem, with Docker picking up my system's DNS entries in /etc/resolv.conf). I'm also not heavily using containers that need to resolve DNS entries, so I'm not sure whether this problem affects other projects. I didn't see it anywhere else, but I haven't tested enough to say. It would be interesting to take a look at how NPM approaches DNS resolution and to get a better look at any returned error, but I'm not a Node/JS expert 😅
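
The "1 good in every three" observation fits the nginx documentation for the `resolver` directive, which says name servers are queried in a round-robin fashion. The sketch below models that with a deliberately simplified assumption: one nameserver is asked per lookup, with no retry and no caching. Real failure rates will be lower because nginx caches successful answers for a while.

```python
from itertools import cycle


def failure_rate(knows_name: list[bool], requests: int) -> float:
    """Fraction of lookups that fail when nameservers are asked in rotation.

    knows_name[i] is True if nameserver i can resolve the container name.
    """
    servers = cycle(knows_name)
    # A lookup fails whenever the rotation lands on a server that lacks the name.
    failures = sum(1 for _ in range(requests) if not next(servers))
    return failures / requests
```

With three nameservers of which only the container DNS knows the name, `failure_rate([True, False, False], 300)` gives two failures in every three lookups, and with a single nameserver that knows the name the rate drops to zero, which matches the single-DNS fix described earlier.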

RobertoMaurizzi avatar Nov 05 '24 15:11 RobertoMaurizzi

I had exactly the same issue - some of my docker apps got a bad gateway error behind NPM every 3rd or 4th request. I figured out that the affected docker containers were assigned to the default docker bridge network (system).

After creating a separate docker bridge network for those containers, the bad gateway issues were completely gone. So no issue on NPM side in my case

BenS89 avatar Dec 22 '24 01:12 BenS89

Issue persisting, NPM 2.12.3, root Podman 4.3.1, Portainer CE 2.27.6.

The way I see it, this issue is caused by the way NGINX takes the host's upstream nameserver into /etc/nginx/conf.d/include/resolvers.con, instead of trusting the Docker/Podman resolver to handle it.

As mentioned in #2197, this is fixable by just removing the host's nameserver, which is not that easy, as /etc is not one of the mounted volumes in most compose configs.
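
As a rough illustration of what "removing the host's nameserver" amounts to, the sketch below builds an nginx `resolver` directive from resolv.conf-style text while keeping only the nameservers you choose. It assumes the resolvers include file holds a single `resolver ... valid=...;` line derived from /etc/resolv.conf; that, the helper name, and the `10s` default are assumptions for this example, and only IPv4 addresses are handled.

```python
def resolver_directive(resolv_conf_text: str, keep: set[str], valid: str = "10s") -> str:
    """Build an nginx `resolver` line that lists only the chosen nameservers."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Only `nameserver <address>` lines matter; skip search/options lines.
        if len(fields) >= 2 and fields[0] == "nameserver" and fields[1] in keep:
            servers.append(fields[1])
    if not servers:
        raise ValueError("no matching nameserver found")
    return f"resolver {' '.join(servers)} valid={valid};"
```

For example, given a resolv.conf that lists both the container network's DNS (such as 10.89.0.1) and the host's upstream nameserver, passing `keep={"10.89.0.1"}` produces a directive that names only the container DNS. Whether that line survives a container restart depends on how the image regenerates the file.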

Considering that other people have reported that Caddy and other proxies handle this better, and that there is no easy in-UI way to define a custom resolver (as described in another NGINX issue), let alone how unintuitive that would be, I do think this is at the end of the day an NPM issue, as it drives people away from using it.

Saying "Just static IP all your containers" or "Containers are not that important to me" is not helpful imho.

varXX404 avatar Jun 07 '25 15:06 varXX404