feat: support additional virtualized NVMe disks
Any node for which nvme_size is specified in settings.yml will automatically have one additional virtual NVMe disk created, which will appear as /dev/nvme0n1. This can be useful for testing additional disks with Longhorn. The way it works is a bit obscure because libvirt only seems to support virtio, scsi and sata disks, so we have to resort to hacking qemu arguments in as described in this blog post:
http://blog.frankenmichl.de/2018/02/13/add-nvme-device-to-vm/
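Concretely, the hack boils down to appending extra arguments to the qemu command line, along these lines (a sketch following that post; the image path, drive id and serial here are illustrative, not necessarily what the Vagrantfile uses):
-drive file=/var/lib/libvirt/images/harvester-node-0-nvme0.qcow2,format=qcow2,if=none,id=nvme0
-device nvme,drive=nvme0,serial=nvme0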
This means we have to create the volume manually first with virsh, then pass its path to qemu. Matters are complicated further by volumes being owned by root by default, while libvirt runs VMs as the qemu user. For normal libvirt volumes, file ownership changes happen automatically, but for these volumes we're hacking in by hand, that doesn't happen, so we have to explicitly specify ownership, and to do that we have to define the volume via XML, hence the nasty virsh vol-create-as ... | sed ... | virsh vol-create invocation.
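Spelled out, that invocation looks roughly like this (volume name and size here are just illustrative):
virsh vol-create-as default harvester-node-0-nvme0.qcow2 --capacity 10G --format qcow2 --print-xml \
  | sed '/<target>/a <permissions><owner>0</owner><group>0</group></permissions>' \
  | virsh vol-create default /dev/stdin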
The other wrinkle is that we need to know the exact path to the disk image. Once a volume has been created, you can run virsh vol-path --pool default VOLUME_NAME to get the path, but we need to know the path when setting libvirt.qemuargs, and we don't want to actually create the volume until the vagrant up trigger for the node in question. If we create the volume outside the trigger, it gets created on every evaluation of the Vagrantfile, even for unrelated VMs... So, we call virsh pool-dumpxml default and get the pool path from that, then stick the volume name on the end.
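In shell terms, that amounts to roughly the following (the Vagrantfile actually does this in Ruby with Nokogiri; xmllint and the volume name are only for illustration):
pool_path=$(virsh pool-dumpxml default | xmllint --xpath '/pool/target/path/text()' -)
nvme_disk="${pool_path}/vagrant-pxe-harvester_harvester-node-0-nvme0.qcow2"   # path then passed to qemu via libvirt.qemuargs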
To test, try something like this in settings.yml:
harvester_network_config:
  cluster:
    - ip: 192.168.0.30
      # [...]
      nvme_size: 10G
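Once the node is up, the extra disk should be visible inside it; a quick check (assuming the device name described above):
lsblk /dev/nvme0n1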
Does the network interface name change after adding an NVMe disk? I have the impression that ens5/ens6 (or something like that) changed after adding the second disk.
Doesn't seem to make any difference. I just gave it a quick test checking ip a in the installer, first without an NVMe:
rancher@rancher:~> ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:71:b5:1c brd ff:ff:ff:ff:ff:ff
altname enp0s5
inet 192.168.121.53/24 brd 192.168.121.255 scope global ens5
valid_lft forever preferred_lft forever
3: ens6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 02:00:00:0d:62:e2 brd ff:ff:ff:ff:ff:ff
altname enp0s6
...and second with an NVMe:
rancher@rancher:~> ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:5d:85:07 brd ff:ff:ff:ff:ff:ff
altname enp0s5
inet 192.168.121.66/24 brd 192.168.121.255 scope global ens5
valid_lft forever preferred_lft forever
3: ens6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 02:00:00:0d:62:e2 brd ff:ff:ff:ff:ff:ff
altname enp0s6
@m-ildefons you might be interested in this, as I see you're adding multiple disk support in https://github.com/harvester/ipxe-examples/pull/86 :-)
When running the setup script I got this error:
$ ./setup_harvester.sh
~/git/ipxe-examples/vagrant-pxe-harvester ~/git/ipxe-examples/vagrant-pxe-harvester
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match
'all'
PLAY [Setup Harvester] *************************************************************************************************
TASK [debug] ***********************************************************************************************************
ok: [localhost] =>
msg: Installation Started
TASK [install PXE server] **********************************************************************************************
fatal: [localhost]: FAILED! => changed=true
cmd: |-
vagrant up pxe_server
delta: '0:00:00.550344'
end: '2024-11-11 08:27:47.004519'
msg: non-zero return code
rc: 1
start: '2024-11-11 08:27:46.454175'
stderr: |-
error: failed to get pool 'default'
error: Storage pool not found: no storage pool with matching name 'default'
Vagrant failed to initialize at a very early stage:
There was an error loading a Vagrantfile. The file being loaded
and the error message are shown below. This is usually caused by
a syntax error.
Path: <provider config: libvirt>
Line number: 114
Message: NoMethodError: undefined method `content' for nil:NilClass
pool_path = Nokogiri::XML(%x(virsh pool-dumpxml default)).at_xpath('/pool/target/path').content
^^^^^^^^
stderr_lines: <omitted>
stdout: ''
stdout_lines: <omitted>
PLAY RECAP *************************************************************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
I have no default pool on my system:
$ virsh pool-list
Name State Autostart
---------------------------
I would suggest adding the line # nvme_size: 10G to each predefined cluster node in settings.yml, so that everyone can see at first glance that this optional setting exists.
When running the setup script I got this error: ...
Same error as @votdev.
@tserong do you have any additional settings for Vagrant to work with virsh?
By the way, I think we could use a more general name, since the created qcow2 file can also be used for other disk types, not only NVMe.
OK, that's weird. I don't think I did anything special to be able to see the default pool... I am in the libvirt group if that makes a difference...?
tserong@ra:~> virsh pool-list
Name State Autostart
-------------------------------
default active no
images active yes
tserong@ra:~> id
uid=1000(tserong) gid=100(users) groups=100(users),108(libvirt),466(docker)
Oh, wait a minute, my ~/.bashrc includes this:
export LIBVIRT_DEFAULT_URI=qemu:///system
I'm going to have to experiment a bit more here. Out of curiosity @votdev @albinsun, what do you see if you run virsh uri (I'm guessing it's qemu:///session). What about if you run LIBVIRT_DEFAULT_URI=qemu:///system virsh pool-list? Does that get a pool list? And, does that command require any additional authentication?
Actually, if you don't mind :-) please try the updated version of this PR, which explicitly specifies qemu:///system as the virsh connection URI. This should do the trick, I hope. I tested by manually unsetting my LIBVIRT_DEFAULT_URI environment variable; when I ran the setup scripts I saw the same failure you both saw, then I applied this latest change and it ran through OK.
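For reference, "explicitly specifies qemu:///system" boils down to pointing the virsh calls at the system libvirt instance rather than the per-user session, e.g. (a sketch, not the exact diff):
virsh --connect qemu:///system pool-dumpxml default
# or equivalently:
LIBVIRT_DEFAULT_URI=qemu:///system virsh pool-dumpxml default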
$ virsh uri
qemu:///session
$ LIBVIRT_DEFAULT_URI=qemu:///system virsh pool-list
Name State Autostart
-------------------------------
default active yes
I'm sorry, it fails again.
TASK [boot Harvester Node 0] **************************************************************************************************
fatal: [localhost]: FAILED! => changed=true
cmd: |-
vagrant up harvester-node-0
delta: '0:00:00.606143'
end: '2024-11-12 09:07:36.016111'
msg: non-zero return code
rc: 1
start: '2024-11-12 09:07:35.409968'
stderr: |-
==> harvester-node-0: Trigger run failed
==> harvester-node-0: A script exited with an unacceptable exit code 2.
A script exited with an unacceptable exit code 2.
stderr_lines: <omitted>
stdout: |-
Bringing machine 'harvester-node-0' up with 'libvirt' provider...
==> harvester-node-0: Running action triggers before up ...
==> harvester-node-0: Running trigger...
==> harvester-node-0: Creating volume vagrant-pxe-harvester_harvester-node-0-nvme0.qcow2
harvester-node-0: Running local: Inline script
harvester-node-0: sh -c " export LIBVIRT_DEFAULT_URI=qemu:///system virsh vol-info --pool=default vagrant-pxe-harvester_harvester-node-0-nvme0.qcow2 >/dev/null 2>&1 || virsh vol-create-as default vagrant-pxe-harvester_harvester-node-0-nvme0.qcow2 --capacity 10G --format qcow2 --print-xml | sed '/<target>/a <permissions><owner>0</owner><group>0</group></permissions>' | virsh vol-create default /dev/stdin 2>&1"
stdout_lines: <omitted>
PLAY RECAP ********************************************************************************************************************
localhost : ok=10 changed=3 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
All good, at least we got slightly further :-) but it's annoying that there's not more detail in the output as to what's failing now :-/ I'm not sure offhand how to further instrument that trigger script... I don't suppose you could try running this nasty thing directly from a shell and see if it prints anything interesting...?
export LIBVIRT_DEFAULT_URI=qemu:///system ; virsh vol-info --pool=default vagrant-pxe-harvester_harvester-node-0-nvme0.qcow2 >/dev/null 2>&1 || virsh vol-create-as default vagrant-pxe-harvester_harvester-node-0-nvme0.qcow2 --capacity 10G --format qcow2 --print-xml |sed '/<target>/a <permissions><owner>0</owner><group>0</group></permissions>' | virsh vol-create default /dev/stdin 2>&1
This command succeeds when run from the CLI. I'll try how it behaves when there are no nodes deployed.
The latest code successfully creates nvme0.qcow2, but it doesn't seem to be in the hardware list yet.
Does it need to be added manually via Add Hardware -> Storage?
Tried to manually add the disk, but it fails with