Loading is taking longer than expected

Open edebrouwer opened this issue 3 years ago • 6 comments

Hi, while trying to launch the wandb server, localhost:8080 stays stuck on the following message:

Loading your local environment... Loading is taking longer than expected. Check stdout, or the system logs at /var/log for error messages. You can restart your server with the environment variable LOCAL_RESTORE=true to regain access if you're unable to login

Output from wandb local --no-daemon :

wandb: A new version of W&B local is available, upgrade by calling wandb local --upgrade
*** Running /etc/my_init.d/00_regen_ssh_host_keys.sh...
*** Running /etc/my_init.d/01_enable-services.sh...
*** Running /etc/my_init.d/02_load-settings.sh...
*** Booting runit daemon...
*** Runit started as PID 38
panic: Dirty database version 5. Fix and force version.

goroutine 1 [running]:
main.main()
	/mnt/ramdisk/core/services/gorilla/cmd/migrate/main.go:115 +0xb05

Output from docker exec -t wandb-local cat /var/log/mysql.log :

2021-05-29T13:45:39.050538Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2021-05-29T13:45:39.052381Z 0 [Note] mysqld (mysqld 5.7.34-0ubuntu0.18.04.1) starting as process 106 ...
2021-05-29T13:45:39.054131Z 0 [Warning] One can only use the --user switch if running as root

2021-05-29T13:45:39.055997Z 0 [Note] InnoDB: PUNCH HOLE support available
2021-05-29T13:45:39.056016Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2021-05-29T13:45:39.056022Z 0 [Note] InnoDB: Uses event mutexes
2021-05-29T13:45:39.056026Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2021-05-29T13:45:39.056031Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-05-29T13:45:39.056035Z 0 [Note] InnoDB: Using Linux native AIO
2021-05-29T13:45:39.056936Z 0 [Note] InnoDB: Number of pools: 1
2021-05-29T13:45:39.057054Z 0 [Note] InnoDB: Using CPU crc32 instructions
2021-05-29T13:45:39.059197Z 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2021-05-29T13:45:39.066667Z 0 [Note] InnoDB: Completed initialization of buffer pool
2021-05-29T13:45:39.069002Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2021-05-29T13:45:39.081081Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2021-05-29T13:45:39.119609Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2021-05-29T13:45:39.119683Z 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2021-05-29T13:45:39.216224Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2021-05-29T13:45:39.216760Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2021-05-29T13:45:39.216774Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2021-05-29T13:45:39.217724Z 0 [Note] InnoDB: Waiting for purge to start
2021-05-29T13:45:39.267849Z 0 [Note] InnoDB: 5.7.34 started; log sequence number 8761776
2021-05-29T13:45:39.267999Z 0 [Note] InnoDB: Loading buffer pool(s) from /vol/mysql/ib_buffer_pool
2021-05-29T13:45:39.268702Z 0 [Note] Plugin 'FEDERATED' is disabled.
2021-05-29T13:45:39.270258Z 0 [Note] InnoDB: Buffer pool(s) load completed at 210529 13:45:39
2021-05-29T13:45:39.274401Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2021-05-29T13:45:39.274418Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
2021-05-29T13:45:39.274954Z 0 [Warning] CA certificate ca.pem is self signed.
2021-05-29T13:45:39.274989Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
2021-05-29T13:45:39.275063Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2021-05-29T13:45:39.275096Z 0 [Note] IPv6 is available.
2021-05-29T13:45:39.275103Z 0 [Note] - '::' resolves to '::';
2021-05-29T13:45:39.275124Z 0 [Note] Server socket created on IP: '::'.
2021-05-29T13:45:39.301119Z 0 [Note] Event Scheduler: Loaded 0 events
2021-05-29T13:45:39.301330Z 0 [Note] mysqld: ready for connections.
Version: '5.7.34-0ubuntu0.18.04.1' socket: '/var/run/mysqld/mysqld.sock' port: 3306 (Ubuntu)
2021-05-29T13:45:39.915944Z 5 [Note] Aborted connection 5 to db: 'wandb_local' user: 'wandb_local' host: '127.0.0.1' (Got an error reading communication packets)
2021-05-29T13:45:39.915961Z 4 [Note] Aborted connection 4 to db: 'wandb_local' user: 'wandb_local' host: '127.0.0.1' (Got an error reading communication packets)
exec mysqld >> /var/log/mysql.log 2>&1

I already tried the following:

docker stop wandb-local
docker volume rm wandb
wandb local

But the problem persists. Could you please help me with this? Thanks a lot.

System: Ubuntu 18.05.5
Wandb version: 0.10.31
Python: 3.7.9

edebrouwer avatar May 29 '21 13:05 edebrouwer

@edebrouwer this means the database was unable to migrate. Version 5 is very low, we need to migrate 80 versions on boot. This is usually caused by killing the docker instance while it's migrating (if you ctrl-c the --no-daemon mode that would kill the migration). It could also be an issue with your docker IO configuration. I would try the following:

docker stop wandb-local
docker volume rm wandb
wandb local
# Wait 60 seconds, then run:
docker logs wandb-local

Then share the log output here. It usually takes 15-30 seconds to migrate the database from scratch. If it's taking much longer than that, you likely need to tweak your docker IO settings. Can you also share the output of docker info?
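For anyone who wants to check the logs for this failure automatically, here is a minimal sketch in Python. It assumes the container is named wandb-local, as in the commands above, and that a failed migration prints the same "Dirty database version N" panic line shown in the original report. The helper names (find_dirty_version, container_logs) are made up for illustration and are not part of wandb or docker.

```python
import re
import subprocess

# Matches the panic line from the original report, e.g.
# "panic: Dirty database version 5. Fix and force version."
DIRTY_RE = re.compile(r"Dirty database version (\d+)")


def find_dirty_version(log_text):
    """Return the dirty migration version found in log_text, or None."""
    match = DIRTY_RE.search(log_text)
    return int(match.group(1)) if match else None


def container_logs(name="wandb-local"):
    """Return the combined stdout/stderr of `docker logs <name>`.

    The container name is an assumption taken from the commands in this thread.
    """
    result = subprocess.run(
        ["docker", "logs", name],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    version = find_dirty_version(container_logs())
    if version is None:
        print("No dirty-migration panic found in the logs.")
    else:
        print(f"Migration stopped in a dirty state at version {version}.")
```

Run it after the 60-second wait. If it reports a dirty version, the migration was probably interrupted again and the full log output is still worth sharing here.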

vanpelt avatar May 29 '21 19:05 vanpelt

Hi, thank you for your quick response! I waited 10 minutes and now it's up :) Actually, I'm running it on a virtual machine, and the machine was stopped last night without stopping the docker instance first. Could that be the source of the initial issue? (Sorry if this is a stupid question.) Thanks!

edebrouwer avatar May 29 '21 23:05 edebrouwer

Something similar happens to me because MySQL stops itself. https://github.com/wandb/local/issues/40

bzamecnik avatar Oct 08 '21 08:10 bzamecnik

Hey @bzamecnik, you really shouldn't be running the instance with MySQL inside of the container. That setup is only for trial purposes; any production deployment should connect the instance to an external MySQL. I'll follow up in #40.

vanpelt avatar Oct 08 '21 21:10 vanpelt

@vanpelt Thanks. Yeah, of course. At the moment the installation is for trial purposes. For production usage it can be installed within k8s using RDS.

bzamecnik avatar Oct 11 '21 05:10 bzamecnik

@bzamecnik exactly, the terraform in this repository can make that really straightforward.

vanpelt avatar Oct 11 '21 21:10 vanpelt