Adding a disk to a vm tries a single generated name and fails

Open dnabre opened this issue 3 years ago • 8 comments

The add disk feature (from Issue #7) is very handy. However, I've run into an issue: it doesn’t handle conflicts between the generated disk image/zvol/sparse-zvol name and existing ones.

Not sure what the expected behavior should be here. The error message might give a touch more info, but I think it’s relatively clear what is happening. Error message:

root@balthazar:~# vm add -d disk -t sparse-zvol -s 2T janet
cannot create 'tank4/vm/janet/disk3': dataset already exists
/usr/local/sbin/vm: ERROR: failed to create new ZVOL tank4/vm/janet/disk3

The VM config is attached; it has disks tank4/vm/janet/{janet_root,janet_home,janet_done} attached. There are two previously added disks (tank4/vm/janet/disk{3,4}) which have been removed from the config (commented out), but their zvols are still there (and the data on them is needed).

Working around this issue isn’t hard: rename the old zvols (disk3, disk4) and things will work fine. If ‘vm add -d disk’ had a flag to provide the disk/image name, it would be an even easier workaround.
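A minimal sketch of that rename workaround, written as a dry run that only prints `zfs rename` commands for review. The `print_rename_cmds` helper and the `_old` suffix are illustrative assumptions, not anything vm-bhyve provides.

```shell
#!/bin/sh
# Hypothetical helper (not part of vm-bhyve): print, but do not run,
# the zfs rename commands that move leftover zvols out of the way of
# the generated diskN names. The "_old" suffix is an arbitrary choice.
print_rename_cmds() {
    parent=$1
    shift
    for vol in "$@"; do
        printf 'zfs rename %s/%s %s/%s_old\n' "$parent" "$vol" "$parent" "$vol"
    done
}

# Dataset names taken from the error output above.
print_rename_cmds tank4/vm/janet disk3 disk4
```

Printing the commands first leaves room to check them before anything is renamed; once they look right they can be run by hand or piped to `sh`.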

This is simple enough for an admin to work around interactively, but I’m more concerned that scripts or GUIs built on top of vm-bhyve will have issues here: a user adds a disk, then detaches it without deleting the image, and goes to add another.

As the detachment of the disk from the VM isn’t going through vm-bhyve, it could be argued that it’s not its responsibility to handle the resulting name conflict. However, vm-bhyve doesn’t provide a way to detach disks from a VM, so: 1) it has to be done outside of it, and 2) in the future, vm-bhyve will likely provide a way to detach disks from a VM. That detach feature should be there eventually, and robust handling of disk image names will be needed.
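One possible shape for that handling, sketched under assumptions: instead of trying a single generated name, keep incrementing the index until the name is free. `next_free_disk` is a hypothetical helper, not vm-bhyve code, and the list of existing datasets is hard-coded here to mirror the situation described above.

```shell
#!/bin/sh
# Hypothetical helper (not vm-bhyve code): starting from index $2,
# return the first "<parent>/disk<N>" that is not in the
# newline-separated list of existing dataset names passed as $3.
next_free_disk() {
    parent=$1
    num=$2
    existing=$3
    # -x matches whole lines only, -F treats the name as a fixed string.
    while printf '%s\n' "$existing" | grep -qxF "${parent}/disk${num}"; do
        num=$((num + 1))
    done
    printf '%s/disk%s\n' "$parent" "$num"
}

# On a real host the list could come from:
#   zfs list -H -o name -r tank4/vm/janet
existing='tank4/vm/janet
tank4/vm/janet/janet_root
tank4/vm/janet/janet_home
tank4/vm/janet/janet_done
tank4/vm/janet/disk3
tank4/vm/janet/disk4'

# Three disks are configured, so the generated index starts at 3.
next_free_disk tank4/vm/janet 3 "$existing"   # prints tank4/vm/janet/disk5
```

This only avoids the dataset name collision; whether the new disk's config index should also skip ahead is a separate design question.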

janet.conf.txt

dnabre avatar Sep 02 '22 16:09 dnabre

So why would you have disks that you comment out and not use? It would seem easier if they weren't commented and then disk5 dataset can get created.

I ran into a different problem after I added a disk, it killed my NIC and added a new one.

GogoFC avatar Sep 17 '22 09:09 GogoFC

I needed to detach the disks and start the VM without them. If there is a better/preferred way than removing/commenting them from the config file, I'm unaware of it.

dnabre avatar Sep 17 '22 14:09 dnabre

No, I meant why did you need to start the VM without the disks :) not why you did it in that way.

This isn't a bug or a possible feature in my mind, because you've essentially hidden the disks from the config file, so bhyve thinks there are no disks and tries to make disks 3 and 4 again.

You could just uncomment them temporarily, attach a new disk, and then comment them out again; the new disk will be disk5.

GogoFC avatar Sep 17 '22 15:09 GogoFC

Somehow I didn't see your comment, yet managed to mark it as seen... pardon the belated response.

As mentioned in the issue, this is mostly a violation of the Principle of Least Astonishment and, more importantly, it demonstrates how the fragile configuration system can break down even under what is normal usage today.

Why did I disable a disk by commenting it out? Because I didn’t want the VM to start with the disk attached. In this case, I was debugging/troubleshooting a network synchronization server that was crashing in an uncontrolled and unpredictable manner (no log showing an error, or even that the system was going down because of an unknown error). Different disks had different versions of the data store that the system syncs from/to, with those after this crash being potentially in an invalid state. Part of that debugging was giving it a blank drive to initialize with metadata (vm add is very handy for that). I had a few versions of the data store that I knew weren’t corrupted, but I wanted to be sure that those weren’t touched by the server unless I was specifically working with them.

I have no doubt that there are other ways of handling this situation than attaching/detaching disks from the VM, maybe some that would be vastly superior. ZFS snapshots might be an answer, but I haven’t used snapshots much with zvols (let alone sparse ones that are being fed to bhyve). Changing how the server starts up and initializes, so that I could boot up the VM and then point it at the mountpoint to use, definitely sounds like a possible way of doing it. The current startup doesn’t do that, however, and changing code that isn’t related to the issue I’m trying to fix isn’t a good idea in my book. Not to mention that I then couldn’t just change the setup, launch a new server config, wait for it to fully initialize, and work with it from there. I’d have to start it up, wait to tell it which mountpoint to use (and hope I don’t mistype it and let it manage the wrong one), or alternatively add some way to pass that information along from outside bhyve, then wait for it to initialize.

While my situation is relatively specific to my circumstances, attaching and detaching drives from a VM is a feature that many people use and have found useful. In 20+ years of working with different virtualization systems, it’s been a stock feature in all of them. So I don’t think my using those features is an oddity.

vm-bhyve has a feature to create and add a blank drive to a VM (vm add). At the moment, the only way to remove or detach a disk (that is, temporarily stop bhyve from starting with the disk attached) is by manually editing the configuration file. To remove a disk without breaking the vm add feature, you have to modify the configuration for all disks in the system from that disk's number and up (making sure that there isn’t a gap in the disk numbers). The ZFS datastore relying on other disks’ numbers to determine a new disk’s volume name is just the fallout from that. Deciding at this stage of vm-bhyve's development that manually editing the config file after VM creation isn't supported and may break things, so that if you do it you deal with whatever problems arise, is far from unreasonable.

Alternatively, it would be just as reasonable to say that making this work smoothly would require a different and/or more robust configuration system, which is non-trivial, so this is a feature request that isn’t important right now.

dnabre avatar Oct 10 '22 16:10 dnabre

I had the same/a similar problem, having disk0 and disk2, but no disk1 in the config. So disk2 was not added to device.map and I was wondering why. It's somewhat confusing.
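A rough way to spot such a gap before starting the guest, assuming the config uses `diskN_name=` keys and that disks are scanned from disk0 upward until the first missing index (which is what the behaviour above suggests). `first_disk_gap` is a hypothetical helper, not part of vm-bhyve.

```shell
#!/bin/sh
# Hypothetical check (not part of vm-bhyve): print the first missing
# diskN_name index below the highest one present in a config file.
# Returns 0 if a gap was found, 1 if the numbering is contiguous or
# the file has no uncommented diskN_name lines.
first_disk_gap() {
    conf=$1
    # Highest N for which an uncommented diskN_name= line exists.
    max=$(sed -n 's/^disk\([0-9][0-9]*\)_name=.*/\1/p' "$conf" | sort -n | tail -n 1)
    [ -n "$max" ] || return 1
    num=0
    while [ "$num" -lt "$max" ]; do
        if ! grep -q "^disk${num}_name=" "$conf"; then
            echo "$num"
            return 0
        fi
        num=$((num + 1))
    done
    return 1
}

# Usage (the path is a placeholder):
#   first_disk_gap /path/to/guest.conf && echo "gap in disk numbering"
```

Commented-out lines are ignored because the patterns are anchored at the start of the line, so a disk disabled with `#` shows up as a gap.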

hurzl avatar Jan 26 '23 12:01 hurzl

I have the same problem as @GogoFC on RH and various clones. If I add a disk it kills networking, adding a new device, and supposedly transferring the IP to that device, though it can no longer reach the network. I wind up fiddling with NetworkManager, sometimes managing to fix it.

scottro11 avatar Feb 05 '23 17:02 scottro11

PR https://github.com/churchers/vm-bhyve/pull/374 from 2020 fixes the networking issue when adding and removing disks but for some unknown reason it was never merged.

shonjir avatar Feb 05 '23 18:02 shonjir

Thanks, hope it gets merged quickly

scottro11 avatar Feb 05 '23 19:02 scottro11