Error when performing migrations on initial startup (wger library fails to install)

Open sizzlesloth opened this issue 1 year ago • 1 comments

Using the latest tag for the wger/server image (or 2.3-dev) results in the following error when doing docker compose up. This issue does not occur with 2.2-dev.

This was tested on a clean setup with no previously created volumes, using the docker compose file; the only change was mapping the server port to 8000. I never got this issue until recently: the latest image pulled today has it, as did the one pulled a day ago and the one before that, but the image from around a week ago did not (apologies for the vagueness).

wger_server  | Successfully built wger
wger_server  | Installing collected packages: wger
wger_server  |   Attempting uninstall: wger
wger_server  |     Found existing installation: wger 2.3.0a1
wger_server  |     Uninstalling wger-2.3.0a1:
wger_server  | ERROR: Could not install packages due to an OSError: [Errno 22] Invalid argument: '/home/wger/.local/bin/'
wger_server  | 
wger_server  | /home/wger/entrypoint.sh: line 24: wger: command not found
wger_server  | Running in production mode, running collectstatic now
wger_server  | INFO 2024-06-26 23:26:26,709 apps AXES: BEGIN version 6.5.0, blocking by ip_address
wger_server  | 
wger_server  | 0 static files copied to '/home/wger/static', 10140 unmodified.
wger_server  | Performing database migrations
wger_server  | INFO 2024-06-26 23:26:28,536 apps AXES: BEGIN version 6.5.0, blocking by ip_address
wger_db      | 2024-06-26 21:26:28.576 UTC [99] ERROR:  relation "exercises_exercisebase" does not exist at character 75
wger_db      | 2024-06-26 21:26:28.576 UTC [99] STATEMENT:  SELECT COUNT(*) FROM (SELECT "exercises_exercisebase"."id" AS "col1" FROM "exercises_exercisebase" LEFT OUTER JOIN "exercises_exercise" ON ("exercises_exercisebase"."id" = "exercises_exercise"."exercise_base_id") GROUP BY 1 HAVING COUNT("exercises_exercise"."id") = 0) subquery
wger_server  | Traceback (most recent call last):
wger_server  |   File "/usr/local/lib/python3.10/dist-packages/django/db/backends/utils.py", line 89, in _execute
wger_server  |     return self.cursor.execute(sql, params)
wger_server  | psycopg2.errors.UndefinedTable: relation "exercises_exercisebase" does not exist
wger_server  | LINE 1: ...LECT "exercises_exercisebase"."id" AS "col1" FROM "exercises...
wger_server  |                                                              ^
wger_server  | 
wger_server  | 
wger_server  | The above exception was the direct cause of the following exception:

...

Looks like there are two issues: the wger library is not installing, and the database relations (tables) are not being created (possibly because of the first issue?).

I tried starting the db container before everything else to determine if it was some type of race condition, but it didn't solve the issue.

sizzlesloth avatar Jun 26 '24 21:06 sizzlesloth

I did some investigating, and the wger library does not install when pip3 install -e . runs. When exec'ing into the container and running this command myself, I get "invalid argument 'licenses'". Manually installing it as the root user instead of the wger user allows the package to install.

sizzlesloth avatar Jul 03 '24 00:07 sizzlesloth

The cause was a file system issue. When the wger package is installed with pip, I believe it's copied across from host to container at some point? While reinstalling the wger package, pip does an os.rename() under the hood (via shutil.move(), per the traceback below), and that call fails when the source and destination are on different file systems.

Installing collected packages: wger
web-1  |   Attempting uninstall: wger
web-1  |     Found existing installation: wger 2.3.0a2
web-1  |     Uninstalling wger-2.3.0a2:
web-1  |       Created temporary directory: /home/wger/.local/~in
web-1  | ERROR: Could not install packages due to an OSError.
web-1  | Traceback (most recent call last):
web-1  |   File "/usr/lib/python3.10/shutil.py", line 816, in move
web-1  |     os.rename(src, real_dst)
web-1  | OSError: [Errno 18] Invalid cross-device link: '/home/wger/.local/bin/' -> '/home/wger/.local/~in'

My Docker data directory has been reconfigured to be on a ZFS file system. The default storage driver that Docker uses is overlayfs (overlay2). It turns out os.rename() fails with "Invalid cross-device link" (errno 18, EXDEV) when moving files across different file systems.

I imagine a potential solution would be to use Docker's ZFS storage driver, but creating a volume at /home/wger also fixed the issue.

I'm slightly confused as to why this was the fix, though. Do you know at which point we are crossing file systems? It seems like the package is built in /tmp in the container and then moved to the wger user's home directory, also in the container, so it should all be overlayfs.
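One way to start narrowing this down is to compare the device IDs that the container reports for the paths involved. This is a rough diagnostic sketch: `device_ids` and `same_device` are hypothetical helper names, and the paths in the usage section are the ones mentioned in this thread. It assumes `st_dev` reflects the underlying mount, which may not tell the whole story on a layered file system, so matching IDs should not be read as proof that a rename will succeed.

```python
import os


def device_ids(paths):
    """Map each existing path to the device ID reported by os.stat()."""
    return {p: os.stat(p).st_dev for p in paths if os.path.exists(p)}


def same_device(a, b):
    """Return True when both paths report the same st_dev."""
    return os.stat(a).st_dev == os.stat(b).st_dev


if __name__ == "__main__":
    # Paths taken from the thread; missing ones are skipped.
    for path, dev in device_ids(["/tmp", "/home/wger", "/home/wger/.local"]).items():
        print(f"{path}: st_dev={dev}")
```

Running it inside the container (for example via docker compose exec) with and without the /home/wger volume would show whether the reported devices differ between the two setups.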

Any advice on this would be much appreciated, but I will close the issue.

sizzlesloth avatar Aug 31 '24 00:08 sizzlesloth