I/O performance in a virtual machine
Describe the bug
I am testing disk I/O performance in a virtual machine and found that both read and write performance are several times worse than on the host machine. I would like to know whether this is expected and whether any optimizations are available. Thanks!
To Reproduce
- Use vz.NewVirtualMachine to create a virtual machine with the specified configuration, and use vz.NewMacOSInstaller to install the guest from an official IPSW file.
- For the disk, use vz.NewVirtioBlockDeviceConfiguration, with an IMG file created via vz.CreateDiskImage as the backing disk.
Screenshots
Disk I/O performance of the local machine (MacBook Pro M4)
Disk I/O performance of the virtual machine
Environment that you use to compile (please complete the following information):
- Xcode version: Xcode 16.4
- macOS Version: macOS 15.6
- mac architecture: arm
- Go Version: 1.24.4
You could experiment with the caching/syncing options:
attachment, err := vz.NewDiskImageStorageDeviceAttachmentWithCacheAndSync(diskPath, false, vz.DiskImageCachingModeAutomatic, vz.DiskImageSynchronizationModeFsync)
After adjusting the cache and sync parameters, I saw no significant improvement. In particular, concurrent 4K random read/write (RND4KQD64) loses more than 50% of its performance compared to the host machine.
I used the following two configurations:
- vz.DiskImageCachingModeAutomatic, vz.DiskImageSynchronizationModeFsync: essentially identical results. Is this the default configuration, i.e. the one used by vz.NewDiskImageStorageDeviceAttachment?
- vz.DiskImageCachingModeUncached, vz.DiskImageSynchronizationModeNone: essentially identical results.
Do you have any other ideas? Thanks!
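Only two pairings were tried at this point, so it may help to list the whole option space before running more tests. The sketch below is a standalone helper that prints every caching/synchronization pairing by constant name. It does not import the vz package. The names `combo` and `allCombos` are hypothetical, and `DiskImageSynchronizationModeFull` is not mentioned in this thread: it is assumed from Apple's `VZDiskImageSynchronizationMode` and should be checked against the vz version in use.

```go
package main

import "fmt"

// combo names one caching/synchronization pairing by its vz constant names.
// The values are plain strings so this sketch compiles without the vz package.
type combo struct {
	Caching string
	Sync    string
}

// allCombos returns every pairing of the three caching modes with the three
// synchronization modes. "Full" is an assumption; see the note above.
func allCombos() []combo {
	caching := []string{
		"DiskImageCachingModeAutomatic",
		"DiskImageCachingModeUncached",
		"DiskImageCachingModeCached",
	}
	syncs := []string{
		"DiskImageSynchronizationModeFull",
		"DiskImageSynchronizationModeFsync",
		"DiskImageSynchronizationModeNone",
	}
	out := make([]combo, 0, len(caching)*len(syncs))
	for _, c := range caching {
		for _, s := range syncs {
			out = append(out, combo{Caching: c, Sync: s})
		}
	}
	return out
}

func main() {
	// Print a checklist of configurations to benchmark one by one.
	for i, c := range allCombos() {
		fmt.Printf("%d. vz.%s + vz.%s\n", i+1, c.Caching, c.Sync)
	}
}
```

Each printed pair corresponds to the third and fourth arguments of vz.NewDiskImageStorageDeviceAttachmentWithCacheAndSync as used earlier in the thread.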
I have never looked at I/O performance; I only know that these options exist. While you are at it, you could also test vz.DiskImageCachingModeCached, which is the only caching mode you did not test. I expect it would perform better than Uncached.
After trying various combinations of cache and sync settings, I found that random read/write performance in cached mode was closer to that of the host machine. However, the number of CPU cores allocated to the virtual machine made a noticeable difference. My host machine has a 28-core CPU and 96 GB of RAM. With the VM configured with 4 cores, its random read/write performance was on par with the host. When the VM was given 12 to 24 cores, random read/write performance dropped to half of the host's.
- host 28c96g
- vm 4c4g automatic
- vm 4c4g cache
- vm 12c 4g cache
- vm 24c 90g cache
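To compare host and guest under identical conditions across core counts, one option is to run the same small program on both sides. The following is a minimal sketch of a concurrent 4K random-read test using only the Go standard library. It is not the tool that produced the screenshots above. The 64 goroutines only approximate a queue depth of 64, and reads go through the operating system's page cache, so results on a small file mostly measure memory. The names `randomReadIOPS` and `demo` are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"os"
	"sync"
	"sync/atomic"
	"time"
)

// randomReadIOPS issues blockSize-byte reads at random block-aligned offsets
// from `workers` goroutines for roughly d, and returns completed reads per second.
func randomReadIOPS(path string, blockSize int64, workers int, d time.Duration) (float64, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	blocks := info.Size() / blockSize
	if blocks == 0 {
		return 0, fmt.Errorf("file %s is smaller than one %d-byte block", path, blockSize)
	}

	var (
		ops      atomic.Int64
		wg       sync.WaitGroup
		errOnce  sync.Once
		firstErr error
	)
	start := time.Now()
	deadline := start.Add(d)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(seed int64) {
			defer wg.Done()
			rng := rand.New(rand.NewSource(seed)) // one generator per worker
			buf := make([]byte, blockSize)
			for time.Now().Before(deadline) {
				off := rng.Int63n(blocks) * blockSize
				// ReadAt is safe for concurrent use on the same *os.File.
				if _, err := f.ReadAt(buf, off); err != nil {
					errOnce.Do(func() { firstErr = err })
					return
				}
				ops.Add(1)
			}
		}(int64(w) + 1)
	}
	wg.Wait()
	if firstErr != nil {
		return 0, firstErr
	}
	return float64(ops.Load()) / time.Since(start).Seconds(), nil
}

// demo benchmarks a small scratch file; a real run would instead point
// randomReadIOPS at a large file on the disk under test.
func demo() (float64, error) {
	tmp, err := os.CreateTemp("", "iobench-*.bin")
	if err != nil {
		return 0, err
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.Write(make([]byte, 4<<20)); err != nil {
		tmp.Close()
		return 0, err
	}
	if err := tmp.Close(); err != nil {
		return 0, err
	}
	return randomReadIOPS(tmp.Name(), 4096, 64, 200*time.Millisecond)
}

func main() {
	iops, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("4K random read, 64 workers: %.0f IOPS\n", iops)
}
```

For a meaningful comparison, the test file should be much larger than the guest's RAM, and the same binary and file size should be used on the host and for every VM core count.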
@jingshanccc It is a fairly "known" thing within the macOS (Apple Silicon) virtualization scene that (probably due to the NUMA memory layout) the Ultra line doesn't really perform well when it comes to VM performance. I see you are measuring on Apple M3 Ultra. I have the same experience: you can squeeze out the best performance by keeping the number of assigned cores low.
Actually, none of the macOS CI providers I know of use Ultra Macs for running VMs. I suggest you stick to the Pro machines instead.
Anecdotally, you will get the best performance with VZDiskImageCachingMode:cached + VZDiskImageSynchronizationMode:none.
Also, with Tahoe (macOS 26), Apple introduced the ASIF image format which promises better Disk IO than the good old Disk Image format. VZ support is already merged to master, but not released yet. You can find the commit here: https://github.com/Code-Hex/vz/commit/e669237d0ee813976a729143769c1ed35d05ae05
Your host also needs to be Tahoe to be able to use ASIF. I hope it was useful, let me know.
Thank you so much! We also ultimately suspect that the Ultra's chip architecture is causing the performance issue. Unfortunately, the Pro models do not offer the high configuration we need. In the end, I also chose the cached mode, which gives some improvement. I made preliminary attempts with Tahoe and ASIF on the Ultra as well, and they did not perform well, but I did not test them thoroughly because I was busy with other work. I will find time to continue verifying them later.