Nested Virtualisation (XCP-ng hosting other virtualized hypervisors) doesn't work with Hyper-V

Open lurendrejer opened this issue 6 years ago • 38 comments

I've been fiddling with nestedHVM and HAP via the CLI for quite some time, without much luck. Thinking that I was the problem, I told no one.

Now that Xen Orchestra supports these options (enable nested virtualisation) and it still doesn't work, I was thinking that something might be up with XCP-ng.

XCP-ng 7.6, CPU: X5670 (supports SLAT), hardware platform: HP BL640c G7

Installing Hyper-V and creating a new VM gives me the following error: [image]

I've tried different CPUs, XCP-ng 7.5, and different VMs, all without any luck.

A nested VMware gives the same error (unable to start VMs) and just freezes up.

lurendrejer avatar Dec 03 '18 12:12 lurendrejer

Please double check that your BIOS has all the virt options enabled.

olivierlambert avatar Dec 03 '18 13:12 olivierlambert

Just did, thank you. Everything is enabled. Virtual extensions, VT-D, etc.

image

lurendrejer avatar Dec 03 '18 13:12 lurendrejer

Does XCP-ng display an error message when you try to install it in a nested VM? (The installer screen should tell you whether it can use HVM.)

Also, double check you are using (in the nested VM) the same amount of RAM (dynamic min = dynamic max = static max)
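For reference, pinning the memory limits to a single value can be done from the host CLI. This is a minimal sketch, assuming `xe vm-memory-limits-set` is available on the host and accepts size suffixes such as `GiB`. The helper name `pin_vm_memory`, the UUID placeholder and the 8GiB figure are illustrative, and setting static-min to the same value is an assumption (only the other three limits are named above).

```shell
# Hypothetical helper: set every memory limit of a VM to one value so that
# dynamic min = dynamic max = static max.
# Assumption: static-min is set to the same value as well.
pin_vm_memory() {
  vm_uuid="$1"   # UUID of the VM that will host the nested hypervisor
  size="$2"      # e.g. 8GiB (illustrative value)
  xe vm-memory-limits-set uuid="$vm_uuid" \
    static-min="$size" dynamic-min="$size" \
    dynamic-max="$size" static-max="$size"
}

# Usage (static limits can usually only be changed while the VM is halted):
# pin_vm_memory <vm-uuid> 8GiB
```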

olivierlambert avatar Dec 03 '18 14:12 olivierlambert

Will try XCP tomorrow. Thank you.

Ram is static.

/Mobile

lurendrejer avatar Dec 03 '18 14:12 lurendrejer

Argh, fudge... I made a quick test; I can't create new VMs before my 7.5 to 7.6 upgrade is done.

I'll see if I can get'r'done tomorrow.

I tried starting the installer on an existing VM, and it got to where I select the install disk. But the disk doesn't have enough space (32 GB).

Sorry.

lurendrejer avatar Dec 03 '18 18:12 lurendrejer

Well, since we don't have mission-critical servers like most other companies, I went ahead and upgraded the whole pool via HTTP. Everything is working; I'm installing the XCP-ng VM now.

lurendrejer avatar Dec 03 '18 19:12 lurendrejer

Well, it works with XCP-ng. Which is nice, but since we strive to educate our students in multiple hypervisors, this is a problem.

lurendrejer avatar Dec 03 '18 19:12 lurendrejer

And another thing - why the heck doesn't xcp-ng come with xcp-tools preinstalled? :) It only works with Intel e1000 emulation; emulated Realtek cards stay disconnected after bootup.

lurendrejer avatar Dec 03 '18 20:12 lurendrejer

  1. I can't answer why Hyper-V and VMware cannot work in a nested XCP-ng situation. This is outside my domain of expertise. Does KVM work?
  2. No VM Tools in XCP-ng: because… it's meant to run on bare metal and not in a VM? :wink:

olivierlambert avatar Dec 03 '18 22:12 olivierlambert

It would seem that the Hyper-V role tries to update the CPU's microcode. It works with VMware's virtual hardware version 11, not 10, so I guess this could be a Xen/QEMU problem or a hardware issue.

// https://communities.vmware.com/thread/525611

Quote: One interesting thing to note is that the log file indicates that Windows 2016 tried to update your CPU's microcode patch level from 0x70b to 0x710. ESX does not allow a virtual machine to update the microcode of the CPU. I'm speculating here, but it's possible that Hyper-V will not start the hypervisor in the presence of an erratum fixed by microcode patch 0x710 (erratum BT248).

lurendrejer avatar Dec 04 '18 06:12 lurendrejer

Yes, this is probably the reason. If you want more insights, you can always post on the Xen mailing list, because it's a very specific and "low level" Xen question :)

olivierlambert avatar Dec 04 '18 08:12 olivierlambert

Does this make any sense to you when it comes to XCP? https://wiki.xenproject.org/wiki/Nested_Virtualization_in_Xen

As far as I can see, commit 58f5bcaf solves the problem.

lurendrejer avatar Dec 05 '18 11:12 lurendrejer

  1. As explained in the doc, you can't use a PV L1 guest; you must use HVM (as I expected).
  2. To boot Hyper-V or VMware, it seems you need to mask the CPU, e.g. cpuid = ['0x1:ecx=0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx']. But this is a Xen setting; IDK if you can pass it via XAPI in XCP-ng.
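To make the mask format concrete: in Xen's xl-style `cpuid` syntax, each register value is (per the xl.cfg documentation, as best understood here) a 32-character string with the most significant bit first, where `0` or `1` forces a bit and `x` keeps the default. The leading `0` would therefore clear bit 31 of CPUID leaf 1 ECX, the "hypervisor present" bit. The sketch below only builds such a string; the function name `cpuid_mask_clear_bit` is hypothetical, and nothing here implies that XAPI will pass the result on to Xen.

```shell
# Hypothetical helper: build a 32-character xl-style CPUID bit string that
# forces one bit to 0 and leaves every other bit at its default ('x').
# Assumption: the leftmost character is bit 31 and the rightmost is bit 0.
cpuid_mask_clear_bit() {
  bit="$1"
  mask=""
  i=31
  while [ "$i" -ge 0 ]; do
    if [ "$i" -eq "$bit" ]; then
      mask="${mask}0"   # force this bit to 0
    else
      mask="${mask}x"   # keep the default policy for this bit
    fi
    i=$((i - 1))
  done
  printf '%s\n' "$mask"
}

# Clear bit 31 of leaf 1 ECX and print the setting in xl config syntax
printf "cpuid = ['0x1:ecx=%s']\n" "$(cpuid_mask_clear_bit 31)"
```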

olivierlambert avatar Dec 05 '18 14:12 olivierlambert

Oh, I'll give it a whirl. You are right, this must be an L1 setting; I misunderstood. There was some change regarding forced pool CPU masking around 6.5-7.0, IIRC.

lurendrejer avatar Dec 05 '18 15:12 lurendrejer

This is a poopshow, I'm giving up - I've been banging my head against the wall with xe param-set, XAPI and, last but not least, poorly formatted XCP template XML export files.

In my struggle to set the CPU mask, I noticed that you set the exp-nested-hvm flag when changing nested virtualisation in XOA, not HAP and NestedHVM - why is that?

Both do seem to work, but I just figured that the exp-nested-hvm flag would be deprecated when it is no longer experimental. :)

lurendrejer avatar Dec 05 '18 17:12 lurendrejer

This is a long story. You can find "guides" for Xen (the hypervisor), but XS/XCP-ng aren't just Xen; they are the whole thing built around it. That's why I said I have no idea how to pass cpuid. If it's not exposed in XAPI, then you are doomed to dig for days on how to do so.

olivierlambert avatar Dec 05 '18 17:12 olivierlambert

Hi, and thank you.

Before the feature was added to XOA, I tested with xe param-set. Adding NestedHVM and HAP worked just like the exp-nested-hvm flag. I just figured that exp-nested-hvm would be removed from Xen one day.

/edit What I was trying to say is: NestedHVM and HAP can be set via XAPI :)
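For readers following along, setting and reading back a platform flag through XAPI can be sketched as below. Only the exp-nested-hvm key discussed above is used; the function names are hypothetical, and the exact key names used for NestedHVM and HAP are not given in this thread, so they are left out rather than guessed.

```shell
# Hypothetical wrappers around xe; they only touch the exp-nested-hvm
# platform key mentioned in this thread.
enable_nested_hvm() {
  # $1 = VM UUID; writes the key into the VM's platform map
  xe vm-param-set uuid="$1" platform:exp-nested-hvm=true
}

show_nested_hvm() {
  # $1 = VM UUID; reads the single key back from the platform map
  xe vm-param-get uuid="$1" param-name=platform param-key=exp-nested-hvm
}

# Usage on a host (the change normally takes effect on the next VM boot):
# enable_nested_hvm <vm-uuid>
# show_nested_hvm <vm-uuid>
```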

lurendrejer avatar Dec 05 '18 17:12 lurendrejer

And I think that someone more skilled than me in the art of XML would be able to create an XCP template with the CPU mask included.

lurendrejer avatar Dec 05 '18 17:12 lurendrejer

Maybe there is an option in the "other VM params" of the VM object that's read by Xen; try digging on Google.

olivierlambert avatar Dec 05 '18 17:12 olivierlambert

CPUID can be added anywhere, without any sign of verification that it was the right place. The following doesn't work, and I could keep trying more or less forever, since Xen just accepts any given parameter. :)

xe vm-param-list uuid=3ea123c5-f36f-93f3-6979-802cc57c6dcd|grep cpuid

platform (MRW): timeoffset: 0; exp-nested-hvm: true; cpuid: ['0x1:ecx=0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx']; videoram: 8; hpet: true; device-model: qemu-upstream-compat; apic: true; device_id: 0002; cores-per-socket: 2; pae: true; vga: std; nx: true; viridian_time_ref_count: true; viridian: true; acpi: 1; viridian_reference_tsc: true
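A small sketch for pulling one key out of a platform line like the one above, which helps confirm what XAPI has stored. Finding the key there only shows that it was accepted into the map, not that Xen acts on it. The helper name `platform_value` is hypothetical, and the sample string is an abridged copy of the output above.

```shell
# Hypothetical helper: print the value of one key from a "key: value; key: value"
# platform string as printed by xe vm-param-list / vm-param-get.
platform_value() {
  printf '%s\n' "$1" | tr ';' '\n' | sed 's/^ *//' |
    awk -v key="$2" 'index($0, key ": ") == 1 { print substr($0, length(key) + 3); exit }'
}

# Abridged sample taken from the output above
platform="timeoffset: 0; exp-nested-hvm: true; videoram: 8; hpet: true; apic: true; acpi: 1"

platform_value "$platform" exp-nested-hvm
platform_value "$platform" apic

# Against a live host (placeholder UUID):
# platform_value "$(xe vm-param-get uuid=<vm-uuid> param-name=platform)" cpuid
```

An empty result means the key is not present in the map at all.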

lurendrejer avatar Dec 05 '18 17:12 lurendrejer

Maybe something you should look into: the article at https://wiki.xenproject.org/wiki/Nested_Virtualization_in_Xen states that HAP should be set to 1 to avoid poor performance on the nested hypervisor.

I don't know if the flag you set in XO (exp-nested-hvm) sets both of these options, and I wouldn't know how to test it either.

lurendrejer avatar Dec 05 '18 18:12 lurendrejer

Again, Xen is the engine. We don't really have control over what's passed to the hypervisor except via XAPI. Using the nested flag should enable everything in the hypervisor parameters on VM boot.

olivierlambert avatar Dec 05 '18 21:12 olivierlambert

Sorry to bump an old issue, but it seems to me that, because of nested virtualisation issues, WSL2 on Win 10 does not work (WSL1 works). When I try, for example, to import a distro as WSL2, I get the following error (the same thing happens when I try to install a fresh distro image):

I have enabled nested virtualisation, and I have also enabled virtualisation in the BIOS.

> wsl --import centos_7_v2 "$env:userprofile\centos_7_v2"
  "$env:userprofile\Desktop\centos_7.tar" --version 2
Please enable the Virtual Machine Platform Windows feature and ensure virtualization
  is enabled in the BIOS.
For information please visit https://aka.ms/wsl2-install

Also note that I have enabled all required Windows features, even Hyper-V, but that error still persists.

Is there anything I could do in order to make WSL2 work? Thanks. :smiley:

tukusejssirs avatar Oct 28 '20 20:10 tukusejssirs

I am also trying to figure out how to get WSL2 working in a Windows Xen guest.

Any updates?

michael-newsrx avatar Feb 16 '21 14:02 michael-newsrx

OK, the VMs get ACPI devices, but Hyper-V requires an APIC and not an ACPI...

michael-newsrx avatar Feb 16 '21 16:02 michael-newsrx

See: https://docs.microsoft.com/en-us/troubleshoot/windows-server/virtualization/vmbus-device-not-load

michael-newsrx avatar Feb 16 '21 16:02 michael-newsrx

Thanks, @michael-newsrx!

Thanks to your link, I have found a website saying that I can run (as an admin) bcdedit /set detecthal true in order to enable HAL detection.

Note that the website I linked above states (implicitly) that it should not work on Win 10; however, it worked on Win 10 x64 2004 Build 19041.746.

Also note that the Microsoft documentation directs us to use msconfig, but I have no Detect HAL option in there.

Update: Could this setting be set in XCP VM configuration? It’d be awesome! :wink:

tukusejssirs avatar Feb 18 '21 10:02 tukusejssirs

Hrm... I ran the bcdedit command and rebooted, but it still shows ACPI and not APIC?

Do I need to do something to the VM's configuration via xe?

michael-newsrx avatar Feb 18 '21 13:02 michael-newsrx

Actually, I don't have an APIC entry in devmgmt.msc either; see below.

Anyway, I have just noticed that there is a warning emblem on Microsoft Hyper-V Virtual Machine Bus Provider with the following warning:

Windows cannot initialise the device driver for this hardware. (Code 37)

The request is not supported.

xcp_hal_acpi_acip_wsl2

I have no idea how I could solve this.


On the other hand, I successfully ran the wsl --set-default-version 2 command. I also created a test VM in Hyper-V Manager (with no vHDD, but with a virtual DVD drive with a Win 10 installation ISO inserted) and started the machine, unsuccessfully. I got a warning (a different one from the one in the OP); see below. I think the issue here is the same as in devmgmt.msc.

hyper_v_error


Update:

Similar issue for VMware: https://communities.vmware.com/t5/VMware-Fusion-Discussions/Driver-conflict-MS-Hyper-V-Virtual-Machine-Bus-Provider/td-p/939114

BTW, I’ve just uninstalled Microsoft Hyper-V Virtual Machine Bus Provider and scanned for hardware changes in devmgmt.msc; it disappeared and did not reappear.

Then I restarted the VM, removed Hyper-V from system features, restarted the VM, installed Hyper-V into system features, and restarted the VM again. It didn’t solve the issue.

Also, based on a comment from @lurendrejer, I double-checked if I have set apic: true in platform. It is set to true.

tukusejssirs avatar Feb 18 '21 14:02 tukusejssirs

Any updates on this?

michael-newsrx avatar Oct 17 '22 15:10 michael-newsrx