ZODB
ZODB copied to clipboard
ZEO server (5.1.0) does not store its database index properly during a proper shutdown
Plone 5.1 uses ZODB 5.3.0 and ZEO 5.1.1.
During a migration I encountered the fact that restarting a ZEO server (proper stop/start sequence) caused the typical "Ignoring index ...." message. That means that the database index file was not written properly or is ignored for some reason during stop/start. There are no disk resource problems.
I reproduced the issue with a fresh Data.fs
2018-03-26T13:00:49 (17064) terminated by SIGINT
2018-03-26T13:00:49 closing storage '1'
2018-03-26T13:00:49 (17064) removed PID file '/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/zeo.pid'
2018-03-26T13:00:52 (17071) created PID file '/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/zeo.pid'
2018-03-26T13:00:52 (17071) opening storage '1' using FileStorage
2018-03-26T13:00:52 Ignoring index for /home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/filestorage/Data.fs
2018-03-26T13:00:52 StorageServer created RW with storages: 1:RW:/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/filestorage/Data.fs
2018-03-26T13:00:52 listening on ('', 5200)
2018-03-26T13:00:55 (17071) terminated by SIGINT
2018-03-26T13:00:55 closing storage '1'
2018-03-26T13:00:55 (17071) removed PID file '/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/zeo.pid'
2018-03-26T13:01:02 daemonizing the process
2018-03-26T13:01:02 set current directory: '/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/parts/zeo'
2018-03-26T13:01:02 daemon manager started
2018-03-26T13:01:02 spawned process pid=17086
2018-03-26T13:01:03 (17086) created PID file '/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/zeo.pid'
2018-03-26T13:01:03 (17086) opening storage '1' using FileStorage
2018-03-26T13:01:03 Ignoring index for /home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/filestorage/Data.fs
2018-03-26T13:01:03 StorageServer created RW with storages: 1:RW:/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/filestorage/Data.fs
2018-03-26T13:01:03 listening on ('', 5200)
2018-03-26T13:01:06 (17086) terminated by SIGTERM
2018-03-26T13:01:06 closing storage '1'
2018-03-26T13:01:06 (17086) removed PID file '/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/zeo.pid'
2018-03-26T13:01:06 pid 17086: exit status 0
2018-03-26T13:01:06 Exiting
2018-03-26T13:01:11 daemonizing the process
2018-03-26T13:01:11 set current directory: '/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/parts/zeo'
2018-03-26T13:01:11 daemon manager started
2018-03-26T13:01:11 spawned process pid=17107
2018-03-26T13:01:11 (17107) created PID file '/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/zeo.pid'
2018-03-26T13:01:11 (17107) opening storage '1' using FileStorage
2018-03-26T13:01:11 Ignoring index for /home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/filestorage/Data.fs
2018-03-26T13:01:11 StorageServer created RW with storages: 1:RW:/home/andreas.jung.extern/sandboxes/plone-server-buildout-plone5/var/filestorage/Data.fs
2018-03-26T13:01:11 listening on ('', 5200)
I think this is not an issue of Plone 5.1 as this happens sometimes with ZODB version used in Plone 4.3 also.
It is unrelated to Plone for sure. However I have never seen this constant failure with previous ZODB/ZEO versions. Of course an index file may become inconsistent e.g. when you stuff around etc...but in this case I see the odd behavior even with a fresh ZODB/ZEO.
This may be an issue with zdaemon being impatient and killing with -9 before the index file can be written. What version of zdaemon are you using? How large the database? Do you have a stop-timeout setting?
zdaemon 4.2
Plone does not generate a stop-timeout configuration.
This is first showed up with a 15 GB Data.fs but it is also reproducible with a fresh and empty Data.fs.
Weird, I can't reproduce this.
What platform is this? Looks like linux.
yes, this is Linux, a recent Debian version. Let's me try on a different system...as said: this happens with a Plone 5 installation. Perhaps I need to dig into this why the index is being ignored on my system.
I an looking for the same issue with a 4.3 zodb. in my case I found an index file with index_tmp suffix, suggesting that something killed my zeoserver while it was trying to shut down. @zopyx did you also see such a file? my db has over 100gb and takes over an hour to reindex...
I'm 99% sure this is due to zdaemon (or whatever you're using) getting impatient and kill -9 ing your ZEO process before it has had time to save the index. We ran into this at ZC and had to configure zdaemon to wait longer.
zdaemon 3.0.0 added the stop-timeout option to say how long to wait for a graceful shutdown. I suggest setting this to a hight number (30 minutes, probably too long, but ...).