spilo fails to start when a socket error occurs on 169.254.169.254 under Docker Desktop (WSL)
https://github.com/zalando/spilo/blob/24a62c5814887c84bafe87dc2cf1fb19ff264172/postgres-appliance/scripts/configure_spilo.py#L388
The except block in get_provider() catches ConnectionErrors raised by the request to 169.254.169.254. That address belongs to the link-local (APIPA) block, which is unroutable on Windows. On Docker Desktop, however, the resulting Windows socket error does not reach the container as a connection error; it surfaces as an HTTP 403 Forbidden reply inside the spilo container:
root@<spilo_container_id>:/home/postgres# curl -v http://169.254.169.254/
* Trying 169.254.169.254:80...
* Connected to 169.254.169.254 (169.254.169.254) port 80 (#0)
> GET / HTTP/1.1
> Host: 169.254.169.254
> User-Agent: curl/7.81.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 403 connecting to 169.254.169.254:80: connecting to 169.254.169.254:80: dial tcp 169.254.169.254:80: connectex: A socket operation was attempted to an unreachable network.
< Connection: close
<
* Closing connection 0
This causes silent cascading failures in the configuration script (the provider resolves to PROVIDER_UNSUPPORTED), which eventually produce this obscure error message and crash when starting a spilo image:
Traceback (most recent call last):
2023-02-16T16:37:37.455847500Z File "/usr/local/bin/patroni", line 33, in <module>
2023-02-16T16:37:37.455857700Z sys.exit(load_entry_point('patroni==1.6.5', 'console_scripts', 'patroni')())
2023-02-16T16:37:37.455865000Z File "/usr/local/lib/python3.6/dist-packages/patroni/__init__.py", line 235, in main
2023-02-16T16:37:37.455871100Z return patroni_main()
2023-02-16T16:37:37.455896800Z File "/usr/local/lib/python3.6/dist-packages/patroni/__init__.py", line 197, in patroni_main
2023-02-16T16:37:37.455903800Z patroni = Patroni(conf)
2023-02-16T16:37:37.455909600Z File "/usr/local/lib/python3.6/dist-packages/patroni/__init__.py", line 32, in __init__
2023-02-16T16:37:37.455915600Z self.dcs = get_dcs(self.config)
2023-02-16T16:37:37.455921100Z File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/__init__.py", line 106, in get_dcs
2023-02-16T16:37:37.455927200Z Available implementations: """ + ', '.join(sorted(set(available_implementations))))
2023-02-16T16:37:37.455933600Z patroni.exceptions.PatroniException: 'Can not find suitable configuration of distributed configuration store\nAvailable implementations: consul, etcd, exhibitor, kubernetes, zookeeper'
2023-02-16T16:37:37.557672800Z /run/service/patroni: finished with code=1 signal=0
2023-02-16T16:37:37.558436700Z /run/service/patroni: sleeping 30 seconds
This issue breaks spilo on all recent versions of Docker Desktop for Windows (at least on our corporate machines). A temporary workaround is to set SPILO_PROVIDER=local in the docker-compose environment variables. It would be better not to assume that a 403 response necessarily means the container is running on a cloud provider.
This issue shows a similar enough message, which suggests that the behaviour may come from VPNKit, the translation layer between the VM and the host, returning an actual HTTP reply to a request that failed at the socket level.
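To illustrate the suggestion about not treating a 403 as proof of a cloud provider, here is a minimal sketch of a stricter check. It is not the code in configure_spilo.py: the helper names `probe_suggests_local` and `choose_provider` are hypothetical, and the matched substrings are assumptions taken only from the curl output above. With the requests library, the two inputs would presumably come from `response.status_code` and `response.reason`.

```python
from typing import Mapping, Optional

# Value taken from the SPILO_PROVIDER=local workaround described above.
PROVIDER_LOCAL = "local"


def probe_suggests_local(status_code: Optional[int], reason: str = "") -> bool:
    """Return True when the metadata probe looks like "no metadata service".

    status_code is None when the request raised a connection error or timed
    out. The substrings below are an assumption based on the proxy-generated
    403 seen in the curl output; other proxies may word the error differently.
    """
    if status_code is None:
        return True
    lowered = reason.lower()
    return status_code == 403 and (
        "unreachable network" in lowered or "dial tcp" in lowered
    )


def choose_provider(
    env: Mapping[str, str], status_code: Optional[int], reason: str = ""
) -> Optional[str]:
    """Pick a provider early, or return None to continue with cloud detection."""
    explicit = env.get("SPILO_PROVIDER")
    if explicit:
        # An explicit setting always wins over any probing.
        return explicit
    if probe_suggests_local(status_code, reason):
        return PROVIDER_LOCAL
    return None
```

A plain "403 Forbidden" is deliberately not treated as local here, since a genuine metadata endpoint could also refuse a request; only the proxy-style error text is matched.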