SSL certificate verify failed even if PREFECT_API_TLS_INSECURE_SKIP_VERIFY=true
Bug summary
It seems the value of PREFECT_API_TLS_INSECURE_SKIP_VERIFY is not properly propagated to websockets connect.
Reproduced with this code, PREFECT_API_TLS_INSECURE_SKIP_VERIFY=true, and connecting to a secure prefect server under nginx (with self signed certificate):
import asyncio

from prefect.events.clients import PrefectEventsClient


async def main():
    # Opening the client triggers the websocket (wss://) connection.
    async with PrefectEventsClient() as client:
        print(f"Connected to: {client._events_socket_url}")
        pong = await client._websocket.ping()
        pong_time = await pong
        print(f"Response received in: {pong_time}")


if __name__ == '__main__':
    asyncio.run(main())
The following error is raised:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "<stdin>", line 2, in main
File "/usr/local/lib/python3.11/dist-packages/prefect/events/clients.py", line 270, in __aenter__
await self._reconnect()
File "/usr/local/lib/python3.11/dist-packages/prefect/events/clients.py", line 288, in _reconnect
self._websocket = await self._connect.__aenter__()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/websockets/legacy/client.py", line 629, in __aenter__
return await self
^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/websockets/legacy/client.py", line 647, in __await_impl_timeout__
return await self.__await_impl__()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/websockets/legacy/client.py", line 651, in __await_impl__
_transport, _protocol = await self._create_connection()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/base_events.py", line 1113, in create_connection
transport, protocol = await self._create_connection_transport(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/base_events.py", line 1146, in _create_connection_transport
await waiter
File "/usr/lib/python3.11/asyncio/sslproto.py", line 578, in _on_handshake_complete
raise handshake_exc
File "/usr/lib/python3.11/asyncio/sslproto.py", line 560, in _do_handshake
self._sslobj.do_handshake()
File "/usr/lib/python3.11/ssl.py", line 979, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1006)
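For readers unfamiliar with how a "skip verify" flag usually reaches a TLS client, here is a minimal sketch using only the standard library. It is not Prefect's implementation: `build_ssl_context` is a hypothetical helper, and how Prefect reads the setting internally is not shown in this thread. The idea is that the resulting context would have to be handed to the websocket client (for example through the `ssl` argument of `websockets.connect`) for the setting to have any effect on wss:// connections.

```python
import ssl
from typing import Optional


def build_ssl_context(
    insecure_skip_verify: bool, cert_file: Optional[str] = None
) -> ssl.SSLContext:
    """Hypothetical helper: build a client-side context for a wss:// connection.

    cert_file is an optional CA bundle or self-signed certificate to trust.
    """
    ctx = ssl.create_default_context(cafile=cert_file)
    if insecure_skip_verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


if __name__ == "__main__":
    ctx = build_ssl_context(insecure_skip_verify=True)
    print(ctx.verify_mode, ctx.check_hostname)
    # Assumed usage with the websockets library (not executed here):
    #   async with websockets.connect(url, ssl=ctx) as ws: ...
```

If the context is never passed to the websocket client, the client falls back to default verification, which would be consistent with the traceback above.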
Version info (prefect version output)
Version: 3.0.0
API version: 0.8.4
Python version: 3.11.9
Git commit: c40d069d
Built: Tue, Sep 3, 2024 11:13 AM
OS/Arch: linux/x86_64
Profile: ephemeral
Server type: server
Pydantic version: 2.8.2
Integrations:
prefect-docker: 0.6.0
Additional context
No response
same here +1
@davidesba @syakesaba
I had the same issue and I did not want to set PREFECT_API_TLS_INSECURE_SKIP_VERIFY=true, but I solved it by specifying the certificate file and not using PREFECT_API_TLS_INSECURE_SKIP_VERIFY.
I assume you are running a worker, which uses websockets.
What worked for me is to run the worker/program with these environment variables set:
SSL_CERT_FILE=./cert/ssl.crt PREFECT_API_SSL_CERT_FILE=./cert/ssl.crt prefect worker start --pool my-pool
Some notes:
- SSL_CERT_FILE is an environment variable that is not specific to prefect; other tools read it as well.
- PREFECT_API_SSL_CERT_FILE is specific to prefect and is not used all the time. I found out that you should use SSL_CERT_FILE, but I used both to make sure everything works.
- I am speculating that PREFECT_API_SSL_CERT_FILE defaults to SSL_CERT_FILE if you don't set it.
- You can set PREFECT_API_SSL_CERT_FILE via prefect config set PREFECT_API_SSL_CERT_FILE=...
- I am not sure which combination of the above is required, so I set both with prefect config and the environment variables, but I think setting the environment variables should be enough.
- If you are using docker images in your deployments/workers, you should not specify the full path to the certificates above, but rather relative paths, unless the full path you are using outside of docker is the same as the one inside docker. For example, I have a directory called cert that holds all the certificates/keys, so a path relative to the project's root directory works. Otherwise you will get errors like "Certificate not found in path" etc.
  - This means that if you are using some kind of process manager like pm2, you need to make sure that cwd is set to the project's directory so that the relative paths don't get messed up.
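The relative-path caveat in the notes above can be checked before starting a worker. The following is only a sketch under assumptions: `resolve_cert_file` is a hypothetical helper (not part of prefect), and it simply resolves the value of SSL_CERT_FILE against a working directory and fails early if the file is missing.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Union


def resolve_cert_file(
    env: Optional[Mapping[str, str]] = None,
    var: str = "SSL_CERT_FILE",
    cwd: Optional[Union[str, Path]] = None,
) -> Optional[Path]:
    """Hypothetical helper: resolve a certificate path the way a process would.

    Relative paths are interpreted against cwd (default: the current working
    directory), which is why a process manager's cwd setting matters.
    """
    env = os.environ if env is None else env
    raw = env.get(var)
    if not raw:
        return None  # variable not set
    base = Path.cwd() if cwd is None else Path(cwd)
    path = Path(raw)
    resolved = path if path.is_absolute() else base / path
    if not resolved.is_file():
        raise FileNotFoundError(f"{var}={raw!r} not found relative to {base}")
    return resolved


if __name__ == "__main__":
    try:
        print(resolve_cert_file())
    except FileNotFoundError as exc:
        print(f"certificate problem: {exc}")
```

Running this from the same directory the worker is launched from (or the container's working directory) shows whether ./cert/ssl.crt would actually be found.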
@tchar Thank you, great info.
I usually use helm to deploy prefect-worker and I tried your tricks.
# create configMap
kubectl create configmap server-cert --from-file=ssl.crt --dry-run=client -o yaml > server-cert.yaml
kubectl apply -f server-cert.yaml
# deploy with helm
helm upgrade ... --set worker.extraEnvVars[0].name=SSL_CERT_FILE --set-string worker.extraEnvVars[0].value=/ssl/ssl.crt --set worker.extraEnvVars[1].name=PREFECT_API_SSL_CERT_FILE --set-string worker.extraEnvVars[1].value=/ssl/ssl.crt --set worker.extraVolumeMounts[0].name=server-cert --set-string worker.extraVolumeMounts[0].mountPath=/ssl/ssl.crt --set-string worker.extraVolumeMounts[0].subPath=ssl.crt --set worker.extraVolumes[0].name=server-cert --set-string worker.extraVolumes[0].configMap.name=server-cert
I got the streaming log-event message for my Runs!
But there was no run graph at the top of the Runs window.
The SSL certificate verification warning for wss:// still appears in the worker's stdout:
WARNING | prefect.events.clients - Unable to connect to wss://myhost.com/api/events/in . Please check your network settings to ensure websocket connections to the API are allowed. Otherwise event data (including task run data) may be lost. Reason: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1006). Set PREFECT_DEBUG_MODE=1 to
Hey @syakesaba,
I forgot to mention that I have updated my prefect server script to use SSL. I did not want to use a reverse proxy so I ended up modifying the script that starts the server venv/lib/pythonx.xx/site-packages/prefect/cli/server.py:start (see below). The settings I mentioned above and the code modifications below are for my setup which is prefect server on localhost, prefect worker also on localhost, self-signed cert, docker. The reason I mention these is because this whole procedure took a bit to set up and I figured out that at some point connections were getting rejected due to the self-signed cert. I think the SSL_CERT_FILE was to solve this, but I tried many things for a couple of days before making it work.
After your answer I realized that I had to start the server with SSL for my answer above to work. I could not find a way to start the server itself with SSL using the current code (unless it is behind a proxy), so I ended up modifying it. You could copy and modify the start function and put it in your source, fork prefect, or start the prefect server via its module (python -m uvicorn ...).
Generally, getting it to work was a bit tedious. I even had to experiment with how I generate the keys (because of localhost vs. 127.0.0.1, etc.).
If I were you I would try to get this to work on localhost/127.0.0.1 with docker first. Then try it on a production env and then try kubernetes. A lot of reasons for this to fail when you try directly on production.
Here is the script to generate the keys (I am pasting it because it took me a while to figure out how to make it work with addext):
# Step 1: Generate the Private Key
openssl genrsa -out localhost.key 2048
# Step 2: Generate the Self-Signed Certificate with SANs (localhost and 127.0.0.1)
openssl req \
-x509 \
-nodes \
-new \
-key localhost.key \
-sha256 \
-days 365 \
-out localhost.crt \
-subj "/CN=localhost" \
-addext "subjectAltName=DNS:localhost,IP:127.0.0.1"
# Step 3: Verify the Certificate to Ensure Both SANs Are Included
openssl x509 -in localhost.crt -text -noout
Run the modified server with:
prefect server start --ssl-keyfile ... --ssl-certfile ...
I'm currently evaluating prefect and also have trouble getting WSS running for the worker. Normal HTTPS connections to the server seem to work, but according to the logs they are unreliable.
Is there any way to get WSS running with prefect server without modifying the prefect code?
We are currently looking into what changes we would need to make for this setup to be simplified; as @tchar speculated, PREFECT_API_SSL_CERT_FILE does default to SSL_CERT_FILE.
It seems that to make the whole process come together we need two new settings: PREFECT_SERVER_API_SSL_CERT_FILE and PREFECT_SERVER_API_SSL_KEY_FILE that pass their values off to the appropriate uvicorn CLI flags when starting the webserver.
Does that sound like it covers the issue well?
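To make the proposal concrete, here is a rough sketch of how two such settings could be mapped onto uvicorn's `--ssl-certfile` and `--ssl-keyfile` CLI flags. This is an illustration only, not the eventual prefect implementation: `build_uvicorn_args` is a hypothetical helper, the settings are read as plain environment variables for simplicity, and `myapp:app` is a placeholder application path.

```python
import os
from typing import List, Mapping, Optional


def build_uvicorn_args(
    app: str, env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Hypothetical helper: build a uvicorn command line with optional TLS flags."""
    env = os.environ if env is None else env
    args = ["uvicorn", app]
    cert = env.get("PREFECT_SERVER_API_SSL_CERT_FILE")
    key = env.get("PREFECT_SERVER_API_SSL_KEY_FILE")
    if cert and key:
        # uvicorn serves HTTPS (and WSS) when both flags are present.
        args += ["--ssl-certfile", cert, "--ssl-keyfile", key]
    elif cert or key:
        # A certificate without its key (or vice versa) cannot be used.
        raise ValueError("certificate and key file must be set together")
    return args


if __name__ == "__main__":
    example_env = {
        "PREFECT_SERVER_API_SSL_CERT_FILE": "localhost.crt",
        "PREFECT_SERVER_API_SSL_KEY_FILE": "localhost.key",
    }
    print(" ".join(build_uvicorn_args("myapp:app", example_env)))
```

Requiring both values together is a design choice in this sketch; it surfaces a half-configured setup at startup rather than at the first TLS handshake.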
Hi @cicdw ,
as mentioned in #16871 the problem in my case was actually something different:
I didn't know that, when deployed in kubernetes, the API URL in the worker needs to be set using the format http://<prefect-server-service-name>.<namespace>.svc.cluster.local:<prefect-server-port>/api. I used the public URL instead, which led to the certificate verification issues.
Since I had PREFECT_API_TLS_INSECURE_SKIP_VERIFY set to true, HTTPS connections worked and WSS connections didn't, as they don't respect this setting.
Thank you @cicdw !
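As a small illustration of the in-cluster URL format described above, the sketch below assembles it from its parts. `in_cluster_api_url` is a hypothetical helper, and the service name, namespace, and port in the usage example are placeholder values, not taken from this thread.

```python
def in_cluster_api_url(service: str, namespace: str, port: int) -> str:
    """Hypothetical helper: build the cluster-internal API URL for a worker.

    Follows the format
    http://<service>.<namespace>.svc.cluster.local:<port>/api
    """
    return f"http://{service}.{namespace}.svc.cluster.local:{port}/api"


if __name__ == "__main__":
    # Placeholder values; substitute the names from your own deployment.
    print(in_cluster_api_url("prefect-server", "prefect", 4200))
```

Because this URL is plain http:// inside the cluster, the worker's websocket connection uses ws:// rather than wss://, so no certificate verification takes place on that path.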
