zfs icon indicating copy to clipboard operation
zfs copied to clipboard

It would be nice to have a 'safe mode' for zfs and zpool commands

Open dswartz opened this issue 9 years ago • 35 comments

I really don't like that I can mistype something and vaporize datasets, and even entire pools. I would like to propose a property defined for datasets, zvols and pools that basically says 'if this is set to ON, any destructive operation will require use of the -f flag'. Thinking of destroying datasets, zvols and pools. Maybe snapshots too? This could really be a totally userland abstraction. What do y'all think?

dswartz avatar Dec 22 '15 18:12 dswartz

:+1:

kernelOfTruth avatar Dec 22 '15 18:12 kernelOfTruth

I would be happy to take a shot at this...

dswartz avatar Dec 22 '15 18:12 dswartz

I think that's a totally reasonable idea which would be a great feature. Can you flesh out a little more detail about how it should work. Also I'd be cautious about a -f option which is already heavily overloaded. It might be safer to force users to explicitly disable this 'safe mode' before allowing the operation. Or how about a locked=on|off property which could be set on the dataset to prevent administrative changes?

behlendorf avatar Dec 22 '15 18:12 behlendorf

Hmm, well, I am not thinking of locking the entity per-se, just making it harder to screw yourself. Although, thinking about this more, I kinda like your idea. If I want to blow away 'tank/vsphere', where all my VMs live, it's not unreasonable for me to have to do 'zfs set locked=off tank/vsphere' first. So, yeah, that sounds good to me. Was there anything else?

dswartz avatar Dec 22 '15 18:12 dswartz

I think you'll want to think through exactly what commands this should apply too.

behlendorf avatar Dec 22 '15 18:12 behlendorf

-f flags are bad UI design, you just end up training people to always type -f (see also kill -9)

In general, If you don't want to destroy a dataset, don't type "destroy" :-p

NB, there are UIs that mark datasets for later destruction, giving you an opportunity to change your mind. However, the most common use case for dataset destroy is to reclaim space, so the deferred destroy is not a general solution.

zpool destroy is reversible, no need for more UI complexity.

-- richard

On Dec 22, 2015, at 12:32 PM, Brian Behlendorf [email protected] wrote:

I think you'll want to think through exactly what commands this should apply too.

— Reply to this email directly or view it on GitHub.

richardelling avatar Dec 22 '15 18:12 richardelling

On 2015-12-22 13:40, Richard Elling wrote:

-f flags are bad UI design, you just end up training people to always type -f (see also kill -9)

In general, If you don't want to destroy a dataset, don't type "destroy" :-p

Yes, thank you. I'm familiar with the chainsaw with no safety guard :)

NB, there are UIs that mark datasets for later destruction, giving you an opportunity to change your mind. However, the most common use case for dataset destroy is to reclaim space, so the deferred destroy is not a general solution.

zpool destroy is reversible, no need for more UI complexity.

Fair enough, I'd forgotten about that. That said, I don't think it's that complex to have any commands that change anything with a pool or dataset or zvol query that property. The small amount of heavy lifting can be encapsulated in one fairly small procedure.

dswartz avatar Dec 22 '15 18:12 dswartz

On Dec 22, 2015, at 10:51 AM, dswartz [email protected] wrote:

On 2015-12-22 13:40, Richard Elling wrote:

-f flags are bad UI design, you just end up training people to always type -f (see also kill -9)

In general, If you don't want to destroy a dataset, don't type "destroy" :-p

Yes, thank you. I'm familiar with the chainsaw with no safety guard :)

What other subcommand can you confuse with "destroy?" Answer: none

So the problem you're trying to solve is one of naming datasets. Since there is no enforced naming, you might try a less failure-prone convention. For example: mypool/dataset1, mypool/dataset11, mypool/dataset2 is more failure-prone than: mypool/Alice, mypool/Bob, mypool/Godzilla

Fully automated systems, when designed well, are not subject to this problem as they won't mistype.

NB, there are UIs that mark datasets for later destruction, giving you an opportunity to change your mind. However, the most common use case for dataset destroy is to reclaim space, so the deferred destroy is not a general solution.

zpool destroy is reversible, no need for more UI complexity.

Fair enough, I'd forgotten about that. That said, I don't think it's that complex to have any commands that change anything with a pool or dataset or zvol query that property. The small amount of heavy lifting can be encapsulated in one fairly small procedure.

Low effort for sure, but also not effective, which is why it doesn't exist.

For those listening at home, this was one of the first great debates when ZFS was first released. -- richard

richardelling avatar Dec 22 '15 21:12 richardelling

Well, to pitch in $.02, I am always terrified of needing to use "zfs destroy foo/bar@baz" to nuke snapshots, and consider the overloading of "destroy" here to be a little on the hazardous side of things. I'd much rather an "unsnapshot" command or something that errored if it was given a name without @.

nwf avatar Dec 25 '15 07:12 nwf

What other subcommand can you confuse with "destroy?"

As nwf said, 'destroy' is used to destroy both, datasets and snapshots. Removing a snapshot from a dataset always feels a bit risky, thus. Until entering '@', you typed a valid command to accidentally destroy your dataset.

CroneKorkN avatar Jun 01 '17 11:06 CroneKorkN

but destroy will not destroy a dataset with children unless passed "-r" as well?

so if you want to type zfs destroy mypool/mydataset@mysnapshot but accidentally hit enter after zfs destroy mypool/mydataset, you will get

cannot destroy 'mypool/mydataset': filesystem has children
use '-r' to destroy the following datasets:
mypool/mydataset@mysnapshot
mypool/mydataset@myothersnapshotwhichiwanttokeep

next step: train yourself to put "-r" AFTER the dataset name, not in front (to prevent the same problem of premature ending of a command when you actually want to recursively delete a dataset)

Fabian-Gruenbichler avatar Jun 01 '17 12:06 Fabian-Gruenbichler

But if there is no snapshot for some reason, the dataset will either be destroyed. I am using dummy-datasets like tank/important-data/dont-delete-me to guarantee the existence of a child-dataset. Separating the commands for handling snapshots from the ones handling datasets or being able to lock a dataset would really feel great.

Thanks for the recommendation to put the '-r'-flag after the target-name.

CroneKorkN avatar Jun 01 '17 12:06 CroneKorkN

I would like to vote for adding a settable flag to pools, datasets, & snapshots, that is something like "protected", that can be set to on/off or yes/no. If the object is marked as destroyable, it behaves as normal: a call to zfs destroy myThing destroys it, no questions asked. However, if you manually set the flag to protect the object, it behaves differently. It is still writable/modifiable if the readOnly flag is off, but any call to zfs destroy myThing will abort and error out; something like: "Cannot destroy 'myThing'. Pool/Dataset/Snapshot is protected."

Also, the ability to do a recursive set=on would be useful, but a recursive set=off, seems somewhat dangerous. Perhaps a confirmations message, but, maybe I'm just paranoid.

[yurelle@host ~] zfs set -r protected=on myThing
[yurelle@host ~] zfs destroy myThing
Cannot destroy 'myThing'. Pool/Dataset/Snapshot is protected.
If you wish to destroy it, disable the "protected" flag.
[yurelle@host ~] zfs set -r protected=off myThing
This will remove protection from all objects under 'MyThing'. Are you sure? (yes/no) _

Where entering no, or Ctrl+c 'ing abort the change.

@behlendorf An additional flag of "locked" to prevent all admin changes also sounds good. I imagine each being set independently, but if "locked" is on, maybe it forces "protected" to true?

-Yurelle

yurelle avatar Apr 15 '18 20:04 yurelle

Holds are only good for snapshots though.

While you could make a snapshot and put a hold on it to prevent the main dataset's destruction, that snapshot references data you might want to free up. It works, but it's not great.

DeHackEd avatar Apr 15 '18 21:04 DeHackEd

On Apr 15, 2018, at 1:56 PM, yurelle [email protected] wrote:

I would like to vote for adding a settable flag to pools, datasets, & snapshots, that is something like "protected", that can be set to on/off or yes/no. If the object is marked as destroyable, it behaves as normal: a call to zfs destroy myThing destroys it, no questions asked. However, if you manually set the flag to protect the object, it behaves differently. It is still writable/modifiable if the readOnly flag is off, but any call to zfs destroy myThing will abort and error out; something like: "Cannot destroy 'myThing'. Pool/Dataset/Snapshot is protected."

It seems like you are requesting a new command for something that you can do today with a procedure. For example, it is not uncommon for service providers to use a "delete = move to trash" approach, because a zfs destroy is destructive. This is trivially implemented thusly:

  • move to trash: zfs set readonly=on

    zfs set volmode=none # for volumes zfs set canmount=noauto|off # for datasets

  • empty trash: zfs destroy

I can see no justification for a new option to zfs destroy, use the existing features. -- richard

@behlendorf https://github.com/behlendorf An additional flag of "locked" to prevent all admin changes also sounds good. I imagine each can be set independently, but if "locked" is on, it forces "protected" to true?

-Yurelle

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/zfsonlinux/zfs/issues/4134#issuecomment-381437382, or mute the thread https://github.com/notifications/unsubscribe-auth/AA08zQ-S4T7Lf2FQwrLPrInvUgnJiHTGks5to7QMgaJpZM4G6HVV.

richardelling avatar Apr 16 '18 15:04 richardelling

@richardelling That doesn't work for snapshots, since we can't set readonly: "this property can not be modified for snapshots". While I know that some sysadmins don't make mistakes, snapshots are, ostensibly, delegate-able to users, and we know that users like to typo commands.

nwf avatar Apr 16 '18 16:04 nwf

I would totally love a protected=true flag I could set on anything I want. Then if I try to destroy it, it would prevent me from doing so. I would have to explicitly remove the flag first.

On my desktop system, I'm often playing around with pools. Creating and destroying them. It's easy for muscle memory to kickin and accidentally type or mistype the wrong dataset name, resulting in destroying the main system's data. Or perhaps you are creating a script that is slightly buggy?

A protected flag would help to alleviate the terror I have anytime I'm working with zfs destroy commands. (I stare at a destroy command for a good 20-30 seconds before executing it.)

chris13524 avatar Jan 09 '19 00:01 chris13524

just throwing it out there but zfs create and destroy, as well as zpool create all have the no-op flag.

bunder2015 avatar Jan 09 '19 00:01 bunder2015

The difference between:

zfs destroy tank/precious

and

zfs destroy tank/precious@snap

Is the difference between my two year old running in and "practicing" his typing while I pause to sip my coffee contemplating if that is actually the command I want to run... Silly....but it absolutely happened.

(There was actually no snapshot on the precious dataset in this case)

Would love to see a protected flag available on the dataset.

@richardelling I'm mostly wanting to protect myself from myself vs an operational procedure. Thoughts?

tcf909 avatar Jun 14 '20 20:06 tcf909

automate it away (DRY)

richardelling avatar Jun 16 '20 00:06 richardelling

I agree with the comment about -f, as this particular flag often prevents people from thinking about what it is they actually asked the machine to do versus what they thought they were asking. I actually think we already have a more zfs/solaris-esque way of doing this, although I have not yet used it, hold for snapshots. Something like zfs hold Pool/dataset might be very nice, and if we type:

# zfs destroy Pool/dataset
Cannot destroy Pool/dataset: dataset is marked as held against destructive operations
# zfs release Pool/dataset
# zfs destroy Pool/dataset

I like how something like this would be lean and clean, and how there's no flag to force it; we explicitly have to run a both seperate and unambiguous command to remove that protection again. By making the command separate, as I feel is a pattern I have seen and liked in illumos, it means at that point, when I hit enter, my mind is thinking about one thing and one thing only, unholding that dataset. Using a flag means that I could go back through my terminal history, edit a command, and accidentally force overwriting a different dataset that I didn't want to, or could mean that we're still thinking about the original command where we were about to damage a filesystem irreperably and the specifics of this, rather than the simple action of "I'm now marking this dataset as okay to be destroyed, but then I must have marked this as not to be overwritten about 9 months aho and can't remember why - what was that reason...?"

However the feature might be implemented, an optional comment on why the dataset is to be marked as not to be destroyed would be fantastic for admins IMO. Using a property on the dataset could also offer all the above features in a different format, I believe (e.g. zfs set nodestroy="read /notes.md because you forgot to copy off the hidden files in the trash when you emptied this back in October" Pool/dataset ; zfs set nodestroy=off Pool/dataset)

I feel an argument that we simply shouldn't run destroy if we're not sure isn't a particularly good one. I recently had a similar issue where I was hoping to make a read-only "copy" of /dev on linux so I could open device files for reading but not writing when I needed to backup my partition tables. I asked about this on the chat for my university's computing society and the only replies I had were that I should use a backup tool (not possible when you need access to raw bytes of course), that dd is nicknamed disk destroyer for a reason and that you should never run anything as root without reading it and checking. I know all of these things, but at 4 AM after 10 hours of sysadmin work, we users inevitably make mistakes in front of a keyboard, and I know myself well in this regard. No matter how much care you take, we all junk a disk from ime to time by accident, somehow.

Our job is often to put in as many sensible measures as we see fit for the job at hand to protect our machine from ourselves, so in my opinion, if a user would like to add another layer of protection because she knows she is likely to make mistakes or errors, perhaps on a system that has not been touched in a very long time, perhaps at a particularly unholy time of day, I think we should absolutely facilitate this if there is little to be compromised by doing so. We have a common policy of offering a program or user as little as they need to get the job done, and nothing more. Unix unfortunately throws this out on many occasions where you are either a normal user called muggins who can do nothing, or become the one all-powerful user named root who cares not for typos or hitting enter too rapidly because the microwave pinged two minutes ago, and who can implicitly do everything, often without being if it's what you really intended to do. We then have ot put in other measures to prevent accidents on a more granular level before we reach this binary decision of normal or everything-including-what-you-didn't-actually-want-this-time.

In my case, I have been writing a script for the last several hours and destroying and creating datasets over and over, necessarily as root, because I cannot mount them in linux without root when I can on illumos. In an ideal world, yes, I would create a pool that I could junk later but I wanted the job done as quickly as possible having spent 12 hours non-stop on this already. I turns out I have both a filesystem called test and a filesystem called temp, and I am also running an SSH session on another machine doing something similar, with temp already containing data I wanted and put there a good while ago; test is, or perhaps, must have been, what I created for the purposes of the script. Right now, I would like to be able to hold on temp such that I don't make any more mistakes.

stellarpower avatar Jul 12 '20 23:07 stellarpower

zfs set org.user.name:nodestroy='this is why you did not want to destroy the dataset or snapshot' pool/volume@snap

bghira avatar Jul 13 '20 15:07 bghira

Polkit, Udev, etc. also came to mind - it might be quite nice to be able to write small functions in something that looks a bit like a scripting language that could be run using hooks before/after executing various commands and allow administrators to customise what happens before and after specific operations according to their own organisational policies. Perhaps a data controller is required by law to keep a copy of data for a minimum of say, 30 days. If it were possible to have a small piece of code to be triggered before deleting a snapshot, it could verify that the snapshot in its arguments is less than 30 days old, and spit out an error, before zfs itself goes and deletes the snapshot, if this isn't the case. In this case, we could easily add a property like the above, and prevent destroying any dataset unless the property is removed first.

stellarpower avatar Jul 30 '20 19:07 stellarpower

Just thinking there's already precedent for ZFS environment variables to control how some commands work (see zpool script code).

What about a variable ZFS_SAFE_MODE which users can export which will make potentially destructive operations act as though "-n" is set?

gdevenyi avatar Jul 31 '20 18:07 gdevenyi

so you can use zed(lets) to micromanage snapshot deletion policy, as i understand it. i have used it for automatically cleaning up old snapshots when a new one is created. though i'm not sure if it can actually deny an operation or reverse one!

@gdevenyi actually a default deny policy for write operations would be kinda useful but i could see that being problematic if one has it set by default in one terminal and goes to another expecting the same restrictions. it is similar to why shell aliases are dangerous.

bghira avatar Jul 31 '20 19:07 bghira

Since the zpool/zfs commands are just open source programs, nothing prevents a user from providing their own that has different policies. For appliances, it is quite common to implement direct calls, like zpool does, from policy elements.

richardelling avatar Jul 31 '20 21:07 richardelling

I found this thread after having run accidentally zfs destroy -r source/photos/2000 instead of zfs destroy -r backup/photos/2000. The backup pool is a directly attached pool I use to regularly zfs send backups to, while source is the source. According to Murphy's Law (of course) I also deleted snapshots from the backup copy, so I had to zfs send the filesystem from a separate backup pool.

How do I prevent shooting myself in the foot with stuff like this?

PS: In reality the pool name of the source is minitank while the backup pool is called microtank.

one-github avatar Oct 31 '21 06:10 one-github

I think zfs is the only FS that has such destructive commands with absolutely no warning and questions. Even deleting all the snapshots etc... Sorry, but thats a big - on zfs currently and that this is an open bug for 6 years....

poelzi avatar Feb 01 '22 15:02 poelzi

Especially because the command to destroy datasets & snapshots are basically identical, only the path is different, and the command to destroy a small narrow thing, includes the command to destroy a broader thing. if you accidentally hit enter part way through typing a command (or copy & paste a command template, and accidentally get a carriage return in the clipboard), you can end up deleting the entire dataset when you only meant to destroy a single snapshot.

zfs destroy myPool/myDataset zfs destroy myPool/myDataset@mySnapshot

yurelle avatar Feb 02 '22 17:02 yurelle

Especially because the command to destroy datasets & snapshots are basically identical

But dataset with children won't be deleted even now, so it's not the case

root@foton:~# zfs create rpool/testds
root@foton:~# zfs snap rpool/testds@snap1
root@foton:~# zfs destroy rpool/testds
cannot destroy 'rpool/testds': filesystem has children
use '-r' to destroy the following datasets:
rpool/testds@snap1

And one more comment to older one:

I found this thread after having run accidentally zfs destroy -r source/photos/2000 instead of zfs destroy -r backup/photos/2000

Unfortunately, even with -imtotallyagreetodeletemydataset word typo of pool name can't be saved, it's independent from any command.

After some years I began to agree with @richardelling here, basic force argument won't help here.

gmelikov avatar Feb 02 '22 17:02 gmelikov