Issue with db migrate with k8s deployment on v0.4.5
Describe the bug:
- I deploy on v0.4.4: everything works; the app is usable.
- I upgrade the line in requirements.txt from reflex==0.4.4 to reflex==0.4.5.
- I make a submission on the page that creates a new record in a table (using the ORM) ... I get the message "an error occurred. See logs for details". In the logs, I see the message you would expect if there were migrations that needed to be run but had not yet been run.
To Reproduce:
- Code/Link to Repo: Proprietary component
Expected behavior: The app should have created a record in the table and moved on to the second page.
Specifics:
- Python Version: 3.11
- Reflex Version: 0.4.5
- OS: python:3.11 docker container on k3s (k8s v1.28) on Ubuntu 23.10
- Browser (Optional): Firefox; Chromium
Additional context:
To clarify the issue, I have inspected the database itself, and it appears to be intact and free of defects:
If I run kubectl -n [my namespace] exec -it [name of reflex pod] -- python3 and then run a SELECT statement on the table where the offending functionality should have inserted a record (using a cursor on SQLite3 in Python), the table appears to be intact and matches the expected initial/default state. I see the table I would expect, with the null record that was added by a script the docker build runs (the null record is required by the record versioning system used). The only thing I see in the logs is:
Detected database schema changes. Run reflex db makemigrations to generate
migration scripts. ## This entry usually appeared when things were working
Debug: Could not get installation_id or project_hash: <------ #### This did not normally appear before this error appeared.
1234[redacted]1234, None
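For reference, a minimal sketch of the kind of check described above, run from the python3 shell inside the pod (the db filename and table name are placeholders; the real schema is proprietary, and reflex only uses reflex.db by default when db_url is not overridden):

import sqlite3

# Assumption: the app uses reflex's default sqlite file in the working directory.
conn = sqlite3.connect("reflex.db")
cur = conn.cursor()

# "records" is a placeholder for the table the submission should have written to.
cur.execute("SELECT * FROM records")
for row in cur.fetchall():
    print(row)  # only the default null record is present

conn.close()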
Notably, the Dockerfile I used (modified from https://github.com/reflex-dev/reflex/blob/main/docker-example/app.Dockerfile) runs the migration with:
# Apply migrations before starting the backend.
RUN [ -d alembic ] && reflex db makemigrations && reflex db migrate
# This only runs an INSERT statement; no schema changes are made.
RUN python3 make-null-records.py
CMD caddy start && reflex run --backend-only --loglevel debug
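Since the migrations run at image build time, one way to confirm they actually landed in the baked-in sqlite file is to query alembic's bookkeeping table. A rough sketch (the db filename is again an assumption based on reflex's default sqlite setup):

import sqlite3

conn = sqlite3.connect("reflex.db")  # assumption: default sqlite file
try:
    # Alembic records the currently applied revision in alembic_version.
    row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
    print("alembic revision:", row[0] if row else "no revision recorded")
finally:
    conn.close()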
Additional troubleshooting I have done:
I tried adding a second instance of the command RUN [ -d alembic ] && reflex db makemigrations && reflex db migrate to the Dockerfile after the command RUN python3 make-null-records.py, then rebuilding and pushing a helm upgrade with the updated container image. That did not resolve the issue.
I tried running the migrations after deployment on the running container just to see if that would [temporarily] resolve the issue (for reference, only one replica is running):
kubectl -n [namespace] exec -it -- sh
reflex db makemigrations
reflex db migrate
No luck. The error persisted.
Debug: Could not get installation_id or project_hash: 1234[redacted]1234, None
This message appears because telemetry cannot get the project_hash when running with --backend-only.
The only consequence is that the telemetry message is not sent; it should have no effect on the db migration itself.
Is the sqlite database built into the container? Or bind mounted in? Or in a docker volume?
For the moment, it is sqlite without a mount. In the revisions I will make tomorrow, it will be replaced with a PVC-backed Postgres.
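For anyone making a similar switch, pointing reflex at Postgres should just be a matter of setting db_url in rxconfig.py and re-running the migrations against the new database. A rough sketch (app name, credentials, and service host are placeholders):

import reflex as rx

config = rx.Config(
    app_name="my_app",  # placeholder
    # Point the ORM at the PVC-backed Postgres instead of the bundled sqlite file.
    db_url="postgresql://user:password@postgres-service:5432/my_app",
)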
After extensive debugging, here is what I see, now that I am getting the errors to log:
The issue indeed is not related to the migrations as far as I know. It's a reverse proxy error. In 0.4.4, everything works like a Swiss clock. When I upgrade to 0.4.5, the websocket traffic fails to be forwarded. If I had to guess with the info I have, I think there may be a difference in how the traffic is being handled, perhaps in how the API_URL="..." parameter is handled.
{"level":"error","ts":1711905656.8950696,"logger":"http.log.error","msg":"dial tcp [::1]:8000: connect: connection refused","request":{"remote_ip":"10.xxx.xxx.xxx","remote_port":"54548","proto":"HTTP/1.1","method":"GET","host":"[my-ip].sslip.io","uri":"/_event/?
EIO=4&transport=websocket","headers":{"Accept-Encoding":["gzip, deflate, br"],"Pragma":["no-cache"],"Sec-Fetch-Mode":["websocket"],"Sec-Websocket-Extensions":["permessage-deflate"],"Sec-Websocket-Version":["13"],"Connection":["Upgrade"],"Sec-Websocket-Key":["nEdDwBLr672C3VaMlvl7Jw=="],"Upgrade":["websocket"],"X-Forwarded-For":["10.xxx.xxx.xxx"],"X-Forwarded-Port":["443"],"X-Forwarded-Server":["traefik-f4564c4f4-48lwc"],"X-Real-Ip":["10.xxx.xxx.xxx"],"User-Agent":["Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:123.0) Gecko/20100101 Firefox/123.0"],"Accept":["*/*"],"Origin":["https://[my-ip].sslip.io"],"Sec-Fetch-Dest":["empty"],"Sec-Fetch-Site":["same-origin"],"Accept-Language":["en-US,en;q=0.5"],"Cache-Control":["no-cache"],"X-Forwarded-Host":["[my-ip].sslip.io"],"X-Forwarded-Proto":["wss"]}},"duration":0.000698262,"status":502,"err_id":"dmc1grtin","err_trace":"reverseproxy.statusError (reverseproxy.go:1272)"}
Info: Overriding config value api_url with env var
API_URL=http://[my-ip].sslip.io
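One observation on the Caddy log above: the proxy is dialing [::1]:8000 (the IPv6 loopback) and getting connection refused, which suggests the backend is accepting connections on IPv4 localhost but not IPv6. A quick diagnostic sketch to run inside the pod, using only the standard library and making no assumptions about reflex internals:

import socket

def can_connect(host: str, port: int = 8000) -> bool:
    # Return True if a TCP connection to (host, port) succeeds.
    try:
        with socket.create_connection((host, port), timeout=2):
            return True
    except OSError:
        return False

# Check both loopbacks a reverse proxy might resolve "localhost" to.
for host in ("127.0.0.1", "::1"):
    print(host + ":8000 ->", "open" if can_connect(host) else "refused/closed")

If only 127.0.0.1 turns out to be open, pointing the reverse proxy at 127.0.0.1:8000 instead of localhost:8000 (or binding the backend to both stacks) would be the usual workaround; that is a hedged guess, not a confirmed fix for this particular regression.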
Were you ever able to resolve this?

Closing this issue for now as it seems it may be outside of reflex, but feel free to reopen if you're still facing issues.