MonetDB icon indicating copy to clipboard operation
MonetDB copied to clipboard

Ability to quiesce a database

Open muhmud opened this issue 4 years ago • 10 comments

Is your feature request related to a problem? Please describe. I would like to be able to keep a monetdb database accessible for read only operations while I perform a storage-level snapshot, such as with the zfs snapshot command.

Describe the solution you'd like Most database systems provide the capability of freezing a database so that writes can not be performed, but connections are not closed, and then thawing, in order to resume normal operation.

Describe alternatives you've considered I am currently stopping monetdb, performing the snapshot, and then starting it again, which means that existing client sessions are closed.

muhmud avatar Feb 01 '21 04:02 muhmud

There's a --readonly option that can be used while the server starts. However it's not possible to change at runtime.

PedroTadim avatar Feb 01 '21 08:02 PedroTadim

How about hot snapshot?

yzchang avatar Feb 01 '21 08:02 yzchang

Yes, I need something I can use while the server is running.

I checked the link for hot snapshots, however, I'm afraid it wouldn't be suitable, as this creates an export of the database. What I'm looking to do leverages the storage system to create an instant point-in-time copy of an entire database, regardless of it's size.

muhmud avatar Feb 01 '21 09:02 muhmud

Hi Muhmud, if you're using zfs-style snapshots and the bat directory and the sql_log directory are on the same filesystem, you need not bother quiescing the system. If a transaction commits, all data is flushed to disk and if a transaction has not yet committed, any of its already on-disk data is clearly marked as uncommitted. This means if you restart MonetDB from a snapshot it will come up in a correct and consistent state compared to the point in time when the snapshot was taken.

In fact, if we may believe to the manual page of zfs(8), the bat directory and the sql_log directory should still come out consistent if they are on different filesystems but their snapshots were taken with a single, recursive, snapshot command:

zfs snapshot [-r] [-o property=value] ... filesystem@snapname|volume@snapname
    Creates a snapshot with the given name. All previous modifications by successful system calls to the file system are part of the snapshot. See the "Snapshots" section for details.

    -r Recursively create snapshots of all descendent datasets. Snapshots are taken atomically, so that all recursive snapshots correspond to the same moment in time. 

Zfs is pretty amazing but as far as I know we have never tested this scenario.

Having said all this, I agree that having a way to temporarily suspend all write to the database could be a useful feature in a wide range of circumstances, especially with filesystems . Currently, a lot of work is being done on the transaction layer, maybe this can be included at some point.

joerivanruth avatar Feb 01 '21 10:02 joerivanruth

Hi @joerivanruth , thank you for your detailed response.

I completely understand where you're coming from, and I think this is true for most database systems, i.e. they will perform a process of recovery on startup and will correctly roll forward/roll back transactions, as required, to bring the database to a consistent point. Having said that, I still usually opt for quiescing the database in order to prevent scenarios that could potentially cause corruption, at least on other platforms, such as half a block being written to disk. I think this is the standard practice for storage-level snapshots.

On a separate note, could you tell me the block size used by monetdb? I couldn't find it in the documentation, and would like to align my ZFS recordsize to the same value.

I currently have my entire dbfarm on a single filesystem.

muhmud avatar Feb 01 '21 10:02 muhmud

Interesting question about the block size, I'll ask around.

joerivanruth avatar Feb 02 '21 09:02 joerivanruth

I asked my colleague, he says that by default, MonetDB uses 64k pages. This can be configured by starting mserver5 with the flag --set gdk_mmap_pagesize=X with X between 4k and 1G.

When you have gained some experience with MonetDB on ZFS I'd be delighted to hear about it.

joerivanruth avatar Feb 03 '21 12:02 joerivanruth

Thanks, that works out great, as I happened to choose 64k for ZFS :)

Sure thing; I've been using monetdb on ZFS while developing our business intelligence solution, but we'll be going to production soon, so will have a lot more information over the next month.

muhmud avatar Feb 05 '21 00:02 muhmud

Please drop us a reference to your BI solution

ghost avatar Apr 29 '21 08:04 ghost

The BI solution was for a company called Open Practice Solutions, though it's a closed-source proprietary application.

They may at some point put some literature on their website about it.

@joerivanruth MonetDB works extremely well on ZFS, by the way, very performant, and I was getting compression ratios of around 15x using lz4. AWS, however, would have a tendency to throttle GP2 disks when burst credits had been consumed, which would cause large imports to slow down heavily. From my own testing at the time, GP3 did not exhibit better performance than GP2, though I'm not sure if these issues have been resolved now.

muhmud avatar May 02 '21 12:05 muhmud