
Add migration of MONs from levelDB to rocksDB after upgrade to Luminous

Open Martin-Weiss opened this issue 5 years ago • 4 comments

Description of Issue/Question

Just had to learn that one major difference between Luminous and pre-Luminous Ceph is that MONs deployed with previous versions use "levelDB", while new MONs deployed with Luminous use "rocksDB".
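For anyone who wants to check which backend an existing MON uses, here is a minimal Python sketch. It rests on two assumptions that should be verified against your own cluster: that the monitor data directory (typically something like /var/lib/ceph/mon/<cluster>-<id>) contains a small `kv_backend` marker file naming the backend, and that a store without this marker predates it and is therefore levelDB. The function name `mon_kv_backend` is made up for illustration and is not part of DeepSea or Ceph.

```python
from pathlib import Path


def mon_kv_backend(mon_data_dir):
    """Return the key/value backend name recorded for a monitor store.

    Assumption: the store directory holds a 'kv_backend' marker file.
    """
    marker = Path(mon_data_dir) / "kv_backend"
    if not marker.is_file():
        # Assumption: stores created before the marker existed are levelDB.
        return "leveldb"
    # The marker is expected to hold a single word such as 'rocksdb'.
    return marker.read_text().strip()
```

Running this against each MON's data directory after the upgrade would show which monitors are still candidates for re-creation.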

For performance and reliability reasons it might be a good idea to upgrade the MONs from levelDB to rocksDB after, or as part of, the migration from pre-Luminous to Luminous.

With this issue we should add a DeepSea enhancement for a rolling monitor upgrade. Probably just delete and re-create one MON after the other, but keep in mind that this means temporarily reducing 3 -> 2 MONs or 5 -> 4 MONs.

Please let's discuss the pros and cons of this topic in this issue.

Martin-Weiss avatar Aug 15 '18 12:08 Martin-Weiss

  1. how 'recommended' or 'necessary' is that in the eyes of upstream?

  2. at what point (if at all) will an update from one version to another deprecate leveldb?

  3. We can do that with stage5, stage3 cycles right now

  4. re. reducing n->n-1. We can always first add one more and then remove one of the 'old' monitors.
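To make the quorum arithmetic behind point 4 concrete: Ceph monitors need a strict majority of the monitors in the monmap to form a quorum. The following is a minimal Python sketch (all function names are made up for illustration and are not part of DeepSea or Ceph) that compares the lowest failure tolerance seen while replacing one monitor, either by removing it first or by adding a replacement first.

```python
def quorum_size(n_mons):
    """Smallest strict majority of n_mons monitors."""
    return n_mons // 2 + 1


def tolerated_failures(n_mons):
    """How many monitors may fail while a quorum can still form."""
    return n_mons - quorum_size(n_mons)


def worst_tolerance(start, add_first):
    """Lowest failure tolerance seen while replacing one monitor.

    add_first=True:  start -> start + 1 -> start (add new, then remove old)
    add_first=False: start -> start - 1 -> start (remove old, then re-create)
    """
    middle = start + 1 if add_first else start - 1
    return min(tolerated_failures(n) for n in (start, middle, start))


for n in (3, 5):
    print(n, "remove-first:", worst_tolerance(n, add_first=False),
          "add-first:", worst_tolerance(n, add_first=True))
```

With 3 MONs, remove-first passes through a 2-MON state in which no further failure is tolerated, while add-first passes through a 4-MON state that still tolerates one failure. With 5 MONs the worst case is one tolerated failure for remove-first and two for add-first.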

jschmid1 avatar Aug 15 '18 14:08 jschmid1

On 15.08.2018 at 16:31, Joshua Schmid [email protected] wrote:

how 'recommended' or 'necessary' is that in the eyes of upstream?

This needs to be answered by upstream. From my point of view, "upgraded clusters" should be identical to "new deployments", and as all new deployments use rocksDB for the MONs, the upgrade process should also upgrade the MON DBs to rocksDB.

at what point (if at all) will an update from one version to another deprecate leveldb?

This also needs to be answered by someone else.

We can do that with stage5, stage3 cycles right now

That was also my hope.

re. reducing n->n-1. We can always first add one more and then remove one of the 'old' monitors.

Unfortunately it is not always possible to add further MONs before removing one, as this would require more servers: at least two servers for MONs per data center. (Think about having one MON in a third data center.) Or can we add a second MON on a server that already has a MON running?


Martin-Weiss avatar Aug 15 '18 15:08 Martin-Weiss

@jecluis Would you agree that the conversion from LevelDB to RocksDB is desirable? Could you please comment on the above? Thanks!

l-mb avatar Aug 16 '18 07:08 l-mb

I'd say this is very much desirable, given rocksdb tends to perform a lot better than leveldb, especially when compacting the store. It is also the default upstream. I'm wary about forcing the migration down an existing deployment's throat though, but having support for it would definitely be ideal.

The only problem with this is that there's no online conversion mechanism, and no tooling to enable offline conversion either. Even if we had offline conversion, it would likely offer no benefit over removing a monitor and recreating it, given that a conversion would likely take time and, once it finished, the monitor would likely still need to synchronize with the quorum.

We may want to step back a little and figure out whether an online conversion would be feasible though. The idea of running with fewer monitors while we are migrating bothers me.

jecluis avatar Aug 16 '18 10:08 jecluis