
issue with db migrate with k8s deployment on v 0.4.5

Open david-thrower opened this issue 5 months ago • 5 comments

Describe the bug:

  • I deploy in v0.4.4:
    • everything works. The app is usable.
  • I upgrade the line on requirements.txt from reflex==0.4.4 to reflex==0.4.5.
    • I make a submission on the page that creates a new record in a table (using the ORM), and I get the message "an error occurred. See logs for details". The logs show the message you would expect if migrations were pending but had not yet been run.

To Reproduce (steps to reproduce the behavior):

  • Code/Link to Repo:

Proprietary component

Expected behavior: The app should have created a record in the table and moved on to the second page.

Screenshots: new-orm-bug (screenshot attached)

Specifics (please complete the following information):

  • Python Version: 3.11
  • Reflex Version: 0.4.5
  • OS: python:3.11 docker container on k3s (k8s v1.28) on Ubuntu 23.10
  • Browser (Optional): Firefox; Chromium

Additional context:

To clarify the issue, I have inspected the database itself, and it appears to be intact and free of defects:

If I run kubectl -n [my namespace] exec -it [name of reflex pod] -- python3 and then run a SELECT statement on the table where the offending functionality should have inserted a record (using a cursor on SQLite3 in Python), the table appears to be intact and matches the expected initial / default state. I see the table I would expect, with the null record added by a script that the docker build runs (the null record is required by the record versioning system used). The only thing I see in the logs is:

Detected database schema changes. Run reflex db makemigrations to generate 
migration scripts. ## This entry usually appeared when things were working
Debug: Could not get installation_id or project_hash:  <------ #### This did not normally appear before this error appeared.  
1234[redacted]1234, None
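The inspection described above can be sketched as follows. The table name and columns are hypothetical stand-ins, and ":memory:" replaces the app's actual SQLite file so the sketch stays self-contained:

```python
import sqlite3

# ":memory:" stands in for the app's SQLite file; the table name "records"
# and its columns are hypothetical -- substitute the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
# Mimic the null record that make-null-records.py inserts at build time.
conn.execute("INSERT INTO records (payload) VALUES (NULL)")
conn.commit()

# The actual check: dump every row and compare against the expected
# initial / default state (only the null record should be present).
rows = conn.execute("SELECT * FROM records").fetchall()
print(rows)
conn.close()
```

If the failed form submission had reached the database, a second row would appear here; seeing only the null record is what points the suspicion away from the schema/migration layer.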

Notably the Dockerfile I used, modified from: https://github.com/reflex-dev/reflex/blob/main/docker-example/app.Dockerfile is running the migration with:

# Apply migrations before starting the backend.
RUN [ -d alembic ] && reflex db makemigrations && reflex db migrate

# <------------------------------------------      THIS ONLY RUNS AN insert STATEMENT; No SCHEMA changes are made
RUN python3 make-null-records.py

CMD caddy start && reflex run --backend-only --loglevel debug
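One aside on the snippet above: `[ -d alembic ] && …` makes the RUN step exit non-zero when alembic/ is absent, which would abort the image build. A hedged sketch of a more defensive variant, assuming the standard `reflex db` subcommands (`init`, `makemigrations`, `migrate`):

```dockerfile
# Sketch of a more defensive migration step. If alembic/ is missing,
# initialize it instead of letting the non-zero exit status of
# [ -d alembic ] fail the build.
RUN if [ -d alembic ]; then \
        reflex db makemigrations && reflex db migrate; \
    else \
        reflex db init; \
    fi

# Seed data only after the schema is in place.
RUN python3 make-null-records.py
```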

david-thrower avatar Mar 26 '24 20:03 david-thrower

Additional troubleshooting I have done:

I tried adding a second instance of the command RUN [ -d alembic ] && reflex db makemigrations && reflex db migrate to the Dockerfile, after the command RUN python3 make-null-records.py, then rebuilding and pushing a helm upgrade with the updated container image. That did not resolve the issue.

I tried running the migrations after deployment on the running container just to see if that would [temporarily] resolve the issue (for reference, only one replica is running):

kubectl -n [namespace] exec -it [name of reflex pod] -- sh
reflex db makemigrations 
reflex db migrate

No luck. The error persisted.

david-thrower avatar Mar 26 '24 21:03 david-thrower

Debug: Could not get installation_id or project_hash: 1234[redacted]1234, None

This message appears because telemetry cannot get the project_hash when running with --backend-only. The only consequence is that the telemetry message is not sent; it should have no effect on the db migration itself.

Lendemor avatar Mar 27 '24 15:03 Lendemor

Is the sqlite database built into the container? Or bind mounted in? Or in a docker volume?

masenf avatar Mar 28 '24 19:03 masenf

For the immediate moment, it is sqlite without a mount. On the revisions I will make tomorrow, it will be replaced with a PVC backed Postgres.

david-thrower avatar Mar 29 '24 00:03 david-thrower

After extensive debugging, here is what I see, now that I am getting the errors to log:

The issue is indeed not related to the migrations, as far as I can tell. It's a reverse proxy error. In 0.4.4, everything works like a Swiss clock. When I upgrade to 0.4.5, websocket traffic fails to be forwarded. If I had to guess with the info I have, there may be a difference in how the traffic is handled, perhaps in how the parameter API_URL="..." is handled.
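For reference, the api_url override reported in the logs follows the usual env-var-beats-config pattern, illustrated generically here (this is not Reflex's actual implementation):

```python
import os

def resolve_api_url(config_value: str) -> str:
    # If API_URL is set in the environment, it wins over the value
    # configured in rxconfig.py (generic illustration, not Reflex code).
    return os.environ.get("API_URL", config_value)

# With the env var set, the configured value is overridden.
os.environ["API_URL"] = "http://example.sslip.io"
print(resolve_api_url("http://localhost:8000"))  # http://example.sslip.io
```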

{"level":"error","ts":1711905656.8950696,"logger":"http.log.error","msg":"dial tcp [::1]:8000: connect: connection refused","request":{"remote_ip":"10.xxx.xxx.xxx","remote_port":"54548","proto":"HTTP/1.1","method":"GET","host":"[my-ip].sslip.io","uri":"/_event/?EIO=4&transport=websocket","headers":{"Accept-Encoding":["gzip, deflate, br"],"Pragma":["no-cache"],"Sec-Fetch-Mode":["websocket"],"Sec-Websocket-Extensions":["permessage-deflate"],"Sec-Websocket-Version":["13"],"Connection":["Upgrade"],"Sec-Websocket-Key":["nEdDwBLr672C3VaMlvl7Jw=="],"Upgrade":["websocket"],"X-Forwarded-For":["10.xxx.xxx.xxx"],"X-Forwarded-Port":["443"],"X-Forwarded-Server":["traefik-f4564c4f4-48lwc"],"X-Real-Ip":["10.xxx.xxx.xxx"],"User-Agent":["Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:123.0) Gecko/20100101 Firefox/123.0"],"Accept":["*/*"],"Origin":["https://[my-ip].sslip.io"],"Sec-Fetch-Dest":["empty"],"Sec-Fetch-Site":["same-origin"],"Accept-Language":["en-US,en;q=0.5"],"Cache-Control":["no-cache"],"X-Forwarded-Host":["[my-ip].sslip.io"],"X-Forwarded-Proto":["wss"]}},"duration":0.000698262,"status":502,"err_id":"dmc1grtin","err_trace":"reverseproxy.statusError (reverseproxy.go:1272)"}

Info: Overriding config value api_url with env var API_URL=http://[my-ip].sslip.io
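One detail in the 502 worth noting: Caddy dialed [::1]:8000, the IPv6 loopback, and was refused. That can happen when the backend binds only an IPv4 address, so pinning the upstream to 127.0.0.1 is worth trying. A Caddyfile sketch in the spirit of the reflex docker-example (the route list and port are assumptions based on that example, not a confirmed fix):

```caddyfile
:{$PORT}

encode gzip

# Forward websocket/event traffic to the Reflex backend explicitly on
# IPv4 loopback instead of "localhost", which may resolve to [::1].
@backend_routes path /_event/* /ping /_upload /_upload/*
handle @backend_routes {
    reverse_proxy 127.0.0.1:8000
}

# Serve the exported frontend for everything else.
root * /srv
route {
    try_files {path} {path}/ /404.html
    file_server
}
```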

david-thrower avatar Mar 31 '24 17:03 david-thrower