Mounting network storage in docker causes error
OS: Ubuntu 16.04, wandb version 0.12.1
I try to start the wandb local server with:
docker run --rm -d -v /home/MYUSERNAME/net_mount/wandb/:/vol -p 8080:8080 --name wandb-local wandb/local
The container starts and the web server starts.
On http://localhost:8080/home, I get the following error:
Error adding user: open /vol/env/users.htpasswd: no such file or directory
panic: Can't create default user
net_mount is mounted with my network credentials (Samba), so if docker is running as root, it might not be able to write to it.
I tried chmod 777 -R /home/MYUSERNAME/net_mount/wandb, but it didn't help.
Note: when I tried to use a local folder, chmod solved the problem.
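One way to narrow this down is to probe the mounted directory from the host before starting the container. The following is only a sketch: `probe_mount` is a hypothetical helper name (not part of wandb or docker). It checks whether a file can be created in the directory and whether a `chmod` on that file actually changes its mode, which on a Samba/CIFS mount typically depends on the mount options rather than on `chmod` itself.

```shell
#!/bin/sh
# probe_mount DIR: hypothetical helper, not part of wandb or docker.
# Reports whether DIR accepts new files and whether chmod takes effect there.
probe_mount() {
    dir=$1
    probe="$dir/.probe_$$"
    # Step 1: can the current user create a file in DIR?
    if ! touch "$probe" 2>/dev/null; then
        echo "not writable"
        return 1
    fi
    echo "writable"
    # Step 2: does chmod change the mode that ls reports afterwards?
    if chmod 600 "$probe" 2>/dev/null &&
       [ "$(ls -l "$probe" | cut -c1-10)" = "-rw-------" ]; then
        echo "chmod honoured"
    else
        echo "chmod ignored or refused"
    fi
    rm -f "$probe"
}
```

For example, `probe_mount /home/MYUSERNAME/net_mount/wandb` run as the same user that docker uses (for instance via `sudo`) shows whether the chmod failure in the log below can be reproduced outside the container.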
Full log:
*** Running /etc/my_init.d/00_regen_ssh_host_keys.sh...
*** Running /etc/my_init.d/01_enable-services.sh...
*** Copying services to runit
*** Copying jobber template
*** Enabling production mode
*** Running /etc/my_init.d/02_load-settings.sh...
chmod: changing permissions of '/vol/env': Operation not permitted <-------
*** Loading settings...
2021/09/13 16:41:17 Created default user
2021/09/13 16:41:17 Generating new session key for auth...
2021/09/13 16:41:17 Generating new certificate and key for auth...
*** Booting runit daemon...
*** Runit started as PID 55
*** Setting up mysql database...
*** Starting wandb servers...
*** Configuring minio...
Bucket created successfully `local/local-files`.
Successfully added arn:minio:sqs:wandb-local:_:redis
*** Migrating database...
panic: dial tcp 127.0.0.1:3306: connect: connection refused
goroutine 1 [running]:
main.main()
/mnt/ramdisk/core/services/gorilla/cmd/migrate/main.go:50 +0xb73
*** Setting up mysql database...
*** Migrating database...
panic: dial tcp 127.0.0.1:3306: connect: connection refused
goroutine 1 [running]:
main.main()
/mnt/ramdisk/core/services/gorilla/cmd/migrate/main.go:50 +0xb73
*** Setting up mysql database...
*** Migrating database...
panic: dial tcp 127.0.0.1:3306: connect: connection refused
goroutine 1 [running]:
main.main()
/mnt/ramdisk/core/services/gorilla/cmd/migrate/main.go:50 +0xb73
*** Setting up mysql database...
You can specify a user ID for the docker container to launch as. As long as the container is started with the root group (0) and that group has write permissions to all content in /vol, the system will work.
It's important to note that running the container with any persistence mounted in it is intended for trial purposes only. You should license the server and connect it to an external MySQL database and S3 compatible object store for any production deployments.
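To check the "write permissions to all content" condition on the host side, something like the following could be used. This is a minimal sketch, not an official wandb tool: `list_not_group_writable` is a hypothetical helper that lists every entry under a directory whose group-write bit is not set. It does not check which group owns the entries, so group ownership would still need to be verified separately.

```shell
#!/bin/sh
# list_not_group_writable DIR: hypothetical helper, not part of wandb or docker.
# Prints every file or directory under DIR that lacks the group-write bit.
list_not_group_writable() {
    # -perm -020 matches entries with group write set; "!" negates the match.
    find "$1" ! -perm -020 -print
}
```

Running `list_not_group_writable /home/MYUSERNAME/net_mount/wandb` on the directory that is mounted at /vol should print nothing if every entry is group-writable.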
When starting docker with uid and gid:
docker run --rm -v wandb:/vol -p 8080:8080 --name wandb-local --user $(id -u):$(id -g) wandb/local
I get the following error in STDOUT:
*** Killing all processes...
Traceback (most recent call last):
File "/sbin/my_init", line 430, in <module>
main(args)
File "/sbin/my_init", line 341, in main
export_envvars()
File "/sbin/my_init", line 116, in export_envvars
with open("/etc/container_environment/" + name, "w") as f:
PermissionError: [Errno 13] Permission denied: '/etc/container_environment/PATH'
UPDATE
I added my user to the root group (sudo usermod -a -G root my_user_name) and then called docker run with gid 0 (docker run --rm -v wandb:/vol -p 8080:8080 --name wandb-local --user $(id -u):0 wandb/local).
Now the errors in STDOUT are:
*** Running /etc/my_init.d/00_regen_ssh_host_keys.sh...
*** Running /etc/my_init.d/01_enable-services.sh...
*** Copying services to runit
*** Setting OpenShift default user
*** Copying jobber template
*** Enabling production mode
*** Running /etc/my_init.d/02_load-settings.sh...
*** Loading settings...
*** Booting runit daemon...
*** Runit started as PID 73
*** Setting up mysql database...
*** Starting wandb servers...
*** Migrating database...
panic: dial tcp 127.0.0.1:3306: connect: connection refused
goroutine 1 [running]:
main.main()
/mnt/ramdisk/core/services/gorilla/cmd/migrate/main.go:50 +0xb73
*** Setting up mysql database...
*** Migrating database...
panic: dial tcp 127.0.0.1:3306: connect: connection refused
UPDATE: Deleting all the files from the local folder allowed wandb to start up (from scratch) correctly.
Not sure if you want to call this a bug or not...
@adizhol what local folder did you delete files from?
Hi @EricWiener, I wasn't able to identify which files/folders @adizhol deleted from reviewing their posts. The phrase "(from scratch)" reads to me as deleting their entire docker image/container and creating a new local instance.