dattobd
Remove the limit that the COW file must exist on the volume that will be snapshotted
Some use cases involve taking block-level snapshots and incremental backups of a raw device (one without a filesystem), such as a KVM virtual machine whose disk image is stored directly on a raw disk, or an Oracle database that uses a raw disk to store its data. If we use a different volume to store the COW file, the base device being snapshotted will see less I/O pressure, because the COW file is written to another device. Currently, dattobd only supports a COW file that exists on the volume being snapshotted. Maybe we could remove this limit, and also support storing the COW file directly on another raw device, to gain more flexibility.
The requirement is a leftover from some of the older implementation details (before the project was put on GitHub) and can probably be removed now. We can look into this at some point, but I'm currently backed up quite a bit with other work. If you would like, you can submit a pull request to add this functionality. Theoretically, all that needs to happen is this:
- The check for the file being on the specified volume would need to be removed.
- The inode pointer in struct snap_device will need to be set to NULL in this case.
- We would need to check that nothing has broken as a result of this change and that COW files on separate volumes work as intended.
That said, there could be other regressions that come up that I am not aware of.
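The first two steps above can be sketched in plain userspace C. This is only a model of the decision, not driver code: `snap_device_model`, `cow_file_model`, `attach_cow`, and the `allow_other_volume` switch are hypothetical names invented for illustration, and the sketch assumes the inode pointer is only meaningful when the COW file lives on the snapshotted volume.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-in for the driver's struct snap_device. */
struct snap_device_model {
    dev_t base_dev;         /* device number of the volume being snapshotted */
    const void *cow_inode;  /* COW file inode if it lives on base_dev, else NULL */
};

/* Hypothetical description of an already opened COW file. */
struct cow_file_model {
    dev_t fs_dev;           /* device number of the volume holding the file */
    const void *inode;      /* opaque handle standing in for the inode */
};

/* The existing restriction: is the COW file on the snapshotted volume? */
static bool cow_is_on_base_volume(dev_t base_dev, const struct cow_file_model *cow)
{
    return cow->fs_dev == base_dev;
}

/*
 * Attach a COW file to a device.
 * With allow_other_volume == false this models the current behaviour
 * (reject COW files on other volumes). With it set to true, the check is
 * effectively removed and the inode pointer is set to NULL for a COW file
 * on a separate volume, as described in the list above.
 * Returns 0 on success or -EINVAL if the file is rejected.
 */
static int attach_cow(struct snap_device_model *dev,
                      const struct cow_file_model *cow,
                      bool allow_other_volume)
{
    assert(dev != NULL && cow != NULL);

    if (cow_is_on_base_volume(dev->base_dev, cow)) {
        dev->cow_inode = cow->inode;
        return 0;
    }
    if (!allow_other_volume)
        return -EINVAL;

    dev->cow_inode = NULL;  /* COW file is on a different volume */
    return 0;
}
```

The third step, checking for regressions, is not something a sketch like this can cover; it would need testing against the real driver.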
I believe it would also have the benefit of being able to freeze the filesystem during backups to ensure consistency.
@c3mb0 Could you explain what you mean by this? We currently ensure filesystem-level consistency by freezing the filesystem for a moment while taking the snapshot, and I'm not sure how this would help application consistency.
What I said was in reference to #134; I have thousands of bolt files and there is currently no way to quiesce bolt operations. Its main functionality relies on mmap (about 50G of files in memory in my case) and it keeps a freelist of pages that have been deleted for reuse. The machine handles thousands of operations per second so things change very quickly.
I'm under the impression that freezing the filesystem is the only proper way to create a consistent backup. Pardon my ignorance if dattobd can in fact handle this situation; I am not very familiar with block-level functionality.
I have implemented the feature you want. If you need it, please contact me ([email protected])
@oracleloyall We would definitely be interested in seeing what you have. It would probably be easiest if you just made a pull request against this repository. Or, if you are not set up with git, you can email the changes to me at [email protected]