firecracker
[Bug] Only up to 64 devices can be used on aarch64
Describe the bug
It is expected that up to 96 devices can be used on aarch64; however, if more than 64 devices are attached to an aarch64 microVM, only the first 64 are usable.
To Reproduce
- Use an aarch64 machine
- Checkout the repro branch
- Build Firecracker:
./tools/devtool build
- Run the test_attach_maximum_devices test:
./tools/devtool -y test -- -vv integration_tests/functional/test_max_devices.py::test_attach_maximum_devices
The observed failure is:
integration_tests/functional/test_max_devices.py:36: in test_attach_maximum_devices
exit_code, _, _ = test_microvm.ssh_iface(i).run("sync")
_ = ''
exit_code = 0
i = 63
test_microvm = <Microvm id=81d12760-4116-4407-9f49-2b3967fbf98a>
test_microvm_with_api = <Microvm id=81d12760-4116-4407-9f49-2b3967fbf98a>
...
host_tools/network.py:93: in _init_connection
raise ConnectionError
E ConnectionError
_ = ''
ecode = 255
self = <host_tools.network.SSHConnection object at 0xffff943f6200>
The test creates a rootfs block device and a number of net devices. When it tries to connect to the last one (the 65th device in total), it fails.
Expected behaviour
The test should have passed, because according to the aarch64 layout, it should be possible to use up to 96 devices on aarch64.
Environment
- Firecracker: 27fb303f7f8487bfe6b78db1bc5f6e4b26b456b6 (main)
- Host kernel: 5.10, guest kernel: 5.10; however, it does not seem to matter
- Rootfs: Firecracker CI Ubuntu 22.04
- Architecture: aarch64
Additional context
Impact: users cannot use more than 64 devices attached to an aarch64 microVM.
Checks
- [x] Have you searched the Firecracker Issues database for similar problems?
- [x] Have you read the existing relevant Firecracker documentation?
- [x] Are you certain the bug being reported is a Firecracker issue?
Hi, is this issue still up? Can attempt a fix.
Hi @dush-t ! Yes, the issue is still valid. Please feel free to take it. Thanks in advance!
Perfect. Wanted to know if this can be tested for on an ARM MacBook, or do I need to get myself a Linux device?
Hi @dush-t, you need Linux KVM to test this.
Hello!
We are students from the University of Texas at Austin taking a virtualization course (cs360v) looking for opportunities to contribute to an open source project for class credit.
Could I be assigned to this?
This is still reproducible.
It is expected that up to 96 devices can be used on aarch64; however, if more than 64 devices are attached to an aarch64 microVM, only the first 64 are usable.
@kalyazin where is the 96 devices limit defined?
Can this limit be configured by a GIC configuration option in the guest?
Currently, I only see 64 virtio IRQs in the test vm spawned by the test:
```
cat /proc/interrupts | grep -i virtio | wc
64
cat /proc/interrupts | grep -i virtio
CPU0 CPU1
14: 1259 0 GIC-0 64 Edge virtio0
15: 129 0 GIC-0 65 Edge virtio1
16: 1 0 GIC-0 66 Edge virtio2
17: 1 0 GIC-0 67 Edge virtio3
18: 1 0 GIC-0 68 Edge virtio4
19: 0 0 GIC-0 69 Edge virtio5
20: 0 0 GIC-0 70 Edge virtio6
21: 0 0 GIC-0 71 Edge virtio7
22: 0 0 GIC-0 72 Edge virtio8
23: 0 0 GIC-0 73 Edge virtio9
24: 0 0 GIC-0 74 Edge virtio10
25: 1 0 GIC-0 75 Edge virtio11
26: 1 0 GIC-0 76 Edge virtio12
27: 1 0 GIC-0 77 Edge virtio13
28: 1 0 GIC-0 78 Edge virtio14
29: 1 0 GIC-0 79 Edge virtio15
30: 1 0 GIC-0 80 Edge virtio16
31: 1 0 GIC-0 81 Edge virtio17
32: 1 0 GIC-0 82 Edge virtio18
33: 1 0 GIC-0 83 Edge virtio19
34: 1 0 GIC-0 84 Edge virtio20
35: 1 0 GIC-0 85 Edge virtio21
36: 1 0 GIC-0 86 Edge virtio22
37: 1 0 GIC-0 87 Edge virtio23
38: 1 0 GIC-0 88 Edge virtio24
39: 1 0 GIC-0 89 Edge virtio25
40: 1 0 GIC-0 90 Edge virtio26
41: 1 0 GIC-0 91 Edge virtio27
42: 1 0 GIC-0 92 Edge virtio28
43: 1 0 GIC-0 93 Edge virtio29
44: 1 0 GIC-0 94 Edge virtio30
45: 1 0 GIC-0 95 Edge virtio31
46: 1 0 GIC-0 96 Edge virtio32
47: 1 0 GIC-0 97 Edge virtio33
48: 1 0 GIC-0 98 Edge virtio34
49: 1 0 GIC-0 99 Edge virtio35
50: 1 0 GIC-0 100 Edge virtio36
51: 1 0 GIC-0 101 Edge virtio37
52: 1 0 GIC-0 102 Edge virtio38
53: 0 0 GIC-0 103 Edge virtio39
54: 0 0 GIC-0 104 Edge virtio40
55: 0 0 GIC-0 105 Edge virtio41
56: 0 0 GIC-0 106 Edge virtio42
57: 0 0 GIC-0 107 Edge virtio43
58: 0 0 GIC-0 108 Edge virtio44
59: 0 0 GIC-0 109 Edge virtio45
60: 0 0 GIC-0 110 Edge virtio46
61: 0 0 GIC-0 111 Edge virtio47
62: 0 0 GIC-0 112 Edge virtio48
63: 0 0 GIC-0 113 Edge virtio49
64: 0 0 GIC-0 114 Edge virtio50
65: 0 0 GIC-0 115 Edge virtio51
66: 0 0 GIC-0 116 Edge virtio52
67: 0 0 GIC-0 117 Edge virtio53
68: 0 0 GIC-0 118 Edge virtio54
69: 0 0 GIC-0 119 Edge virtio55
70: 0 0 GIC-0 120 Edge virtio56
71: 0 0 GIC-0 121 Edge virtio57
72: 0 0 GIC-0 122 Edge virtio58
73: 0 0 GIC-0 123 Edge virtio59
74: 0 0 GIC-0 124 Edge virtio60
75: 0 0 GIC-0 125 Edge virtio61
76: 0 0 GIC-0 126 Edge virtio62
77: 0 0 GIC-0 127 Edge virtio63
```
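A quick way to check the listing above is to extract the hwirq column for each virtio line. The sketch below does that for an abbreviated sample (three lines copied from the listing). The function name `virtio_hwirqs` and the assumed column layout are illustrative only and are not part of Firecracker or its test suite.

```rust
/// Three lines copied from the listing above (first two and last).
const SAMPLE: &str = "14: 1259 0 GIC-0 64 Edge virtio0\n\
15: 129 0 GIC-0 65 Edge virtio1\n\
77: 0 0 GIC-0 127 Edge virtio63";

/// Extract (hwirq, device name) pairs for virtio lines.
/// Assumes the layout "<irq>: <counts...> GIC-0 <hwirq> Edge virtioN".
fn virtio_hwirqs(text: &str) -> Vec<(u32, String)> {
    let mut out = Vec::new();
    for line in text.lines() {
        let fields: Vec<&str> = line.split_whitespace().collect();
        // Only keep lines whose last column is a virtio device name.
        let name = match fields.last() {
            Some(n) if n.starts_with("virtio") => n.to_string(),
            _ => continue,
        };
        // The hwirq is the column right after the "GIC-0" chip name.
        if let Some(pos) = fields.iter().position(|f| *f == "GIC-0") {
            if let Some(hwirq) = fields.get(pos + 1).and_then(|f| f.parse::<u32>().ok()) {
                out.push((hwirq, name));
            }
        }
    }
    out
}

fn main() {
    for (hwirq, name) in virtio_hwirqs(SAMPLE) {
        println!("{name}: hwirq {hwirq}");
    }
}
```

In the full listing the hwirq values run from 64 to 127, which is 64 entries and matches the `wc` count.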
Hi @vliaskov. I believe 96 is inferred from https://github.com/firecracker-microvm/firecracker/blob/main/src/vmm/src/device_manager/resources.rs#L31
gsi_allocator: IdAllocator::new(arch::IRQ_BASE, arch::IRQ_MAX)?,
where
// As per virt/kvm/arm/vgic/vgic-kvm-device.c we need
// the number of interrupts our GIC will support to be:
// * bigger than 32
// * less than 1023 and
// * a multiple of 32.
/// The highest usable SPI on aarch64.
pub const IRQ_MAX: u32 = 128;
/// First usable interrupt on aarch64.
pub const IRQ_BASE: u32 = 32;
This may well be misaligned with what the guest configures. Ideally, if we can only have up to 64 functional devices, we should fail early when more devices are requested via API/config, to avoid hard-to-debug failures for users.
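As a rough illustration of where 96 comes from and what an early check could look like, here is a minimal sketch. It assumes one IRQ per device and a half-open range from `IRQ_BASE` to `IRQ_MAX`. `check_device_count` and `OBSERVED_USABLE` are hypothetical names for this example and do not exist in Firecracker.

```rust
// Values quoted above from Firecracker's aarch64 layout.
const IRQ_BASE: u32 = 32;
const IRQ_MAX: u32 = 128;
// Hypothetical constant: the number of devices the guest was observed to use.
const OBSERVED_USABLE: u32 = 64;

/// Capacity implied by the layout constants, assuming one IRQ per device
/// and that IRQ_MAX itself is not handed out.
fn layout_capacity() -> u32 {
    IRQ_MAX - IRQ_BASE
}

/// Hypothetical early check: reject a request that exceeds `limit`
/// instead of letting the extra devices fail silently in the guest.
fn check_device_count(requested: u32, limit: u32) -> Result<(), String> {
    if requested > limit {
        Err(format!(
            "requested {} devices, but only {} are usable",
            requested, limit
        ))
    } else {
        Ok(())
    }
}

fn main() {
    println!("layout capacity: {}", layout_capacity());
    println!("{:?}", check_device_count(65, OBSERVED_USABLE));
}
```

Where such a check would live, and whether the limit should be 64 or the layout should be changed so that 96 devices work, is left open in this thread.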
Thanks for the clarification. For anyone following, the logic is described in src/vmm/src/arch/aarch64/gic/gicv3/mod.rs:
/// Finalize the setup of a GIC device
pub fn finalize_device(gic_device: &Self) -> Result<(), GicError> {
    // On arm there are 3 types of interrupts: SGI (0-15), PPI (16-31), SPI (32-1020).
    // SPIs are used to signal interrupts from various peripherals accessible across
    // the whole system so these are the ones that we increment when adding a new virtio device.
    // KVM_DEV_ARM_VGIC_GRP_NR_IRQS sets the highest SPI number. Consequently, we will have a
    // total of `super::layout::IRQ_MAX - 32` usable SPIs in our microVM.
    let nr_irqs: u32 = super::layout::IRQ_MAX;
    let nr_irqs_ptr = &nr_irqs as *const u32;
    Self::set_device_attribute(
        gic_device.device_fd(),
        kvm_bindings::KVM_DEV_ARM_VGIC_GRP_NR_IRQS,
        0,
        nr_irqs_ptr as u64,
        0,
    )?;
However, the guest dmesg contains:
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
so there must be something changing the maximum number of SPIs during guest initialization. I haven't figured out what yet.
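For reference, the interrupt classes named in the finalize_device comment above can be sketched as a small classifier. The enum and function names are made up for this example. The SPI upper bound follows the GIC specification excerpt quoted later in this thread (32 to 1019), whereas the Firecracker comment writes 32-1020.

```rust
/// Interrupt classes by INTID range (names are illustrative).
#[derive(Debug, PartialEq)]
enum IntidClass {
    Sgi,   // software generated interrupts, 0-15
    Ppi,   // private peripheral interrupts, 16-31
    Spi,   // shared peripheral interrupts, 32-1019
    Other, // anything above the SPI range
}

fn classify(intid: u32) -> IntidClass {
    match intid {
        0..=15 => IntidClass::Sgi,
        16..=31 => IntidClass::Ppi,
        32..=1019 => IntidClass::Spi,
        _ => IntidClass::Other,
    }
}

fn main() {
    // 64 and 127 are the first and last hwirq values seen for virtio devices.
    for intid in [0, 16, 32, 64, 127, 1020] {
        println!("INTID {} -> {:?}", intid, classify(intid));
    }
}
```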
The logic in the Linux kernel's arch/arm64/kvm/vgic/vgic-kvm-device.c seems to match the one in Firecracker:
[...]
case KVM_DEV_ARM_VGIC_GRP_NR_IRQS: {
	u32 __user *uaddr = (u32 __user *)(long)attr->addr;
	u32 val;
	int ret = 0;

	if (get_user(val, uaddr))
		return -EFAULT;

	/*
	 * We require:
	 * - at least 32 SPIs on top of the 16 SGIs and 16 PPIs
	 * - at most 1024 interrupts
	 * - a multiple of 32 interrupts
	 */
	if (val < (VGIC_NR_PRIVATE_IRQS + 32) ||
	    val > VGIC_MAX_RESERVED ||
	    (val & 31))
		return -EINVAL;

	mutex_lock(&dev->kvm->arch.config_lock);

	if (vgic_ready(dev->kvm) || dev->kvm->arch.vgic.nr_spis)
		ret = -EBUSY;
	else
		dev->kvm->arch.vgic.nr_spis =
			val - VGIC_NR_PRIVATE_IRQS;

	mutex_unlock(&dev->kvm->arch.config_lock);

	return ret;
VGIC_NR_PRIVATE_IRQS evaluates to 32, so the number of SPIs should be what is expected in Firecracker.
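The validation in the kernel snippet above can be mirrored in a few lines to check what value 128 produces. This is only a sketch of the arithmetic: the function name is made up, the -EBUSY path is omitted, and the value 1023 for `VGIC_MAX_RESERVED` is an assumption taken from the kernel headers rather than from anything quoted in this thread.

```rust
// 16 SGIs + 16 PPIs, as stated above.
const VGIC_NR_PRIVATE_IRQS: u32 = 32;
// Assumption: not quoted in the thread; believed to be 1023 in the kernel headers.
const VGIC_MAX_RESERVED: u32 = 1023;

/// Mirror of the range checks for KVM_DEV_ARM_VGIC_GRP_NR_IRQS.
/// Returns the resulting number of SPIs, or an error for rejected values.
fn nr_spis_for(val: u32) -> Result<u32, &'static str> {
    if val < VGIC_NR_PRIVATE_IRQS + 32 || val > VGIC_MAX_RESERVED || (val & 31) != 0 {
        return Err("EINVAL");
    }
    Ok(val - VGIC_NR_PRIVATE_IRQS)
}

fn main() {
    // Firecracker passes IRQ_MAX = 128.
    println!("{:?}", nr_spis_for(128));
    // Not a multiple of 32, so it is rejected.
    println!("{:?}", nr_spis_for(100));
}
```

Under these assumptions 128 is accepted and yields 96 SPIs, so the host-side vGIC configuration alone does not explain the limit of 64.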
A part of VGIC initialization I don't understand (and I don't know whether it's relevant here) is:
12.9.38 GICD_TYPER, Interrupt Controller Type Register
The GICD_TYPER characteristics are:
[...]
•The maximum number of INTIDs that the GIC implementation supports.
ITLinesNumber, bits [4:0]
For the INTID range 32 to 1019, indicates the maximum SPI supported.
If the value of this field is N, the maximum SPI INTID is 32(N+1) minus 1. For example, 00011 specifies that the maximum SPI INTID is 127.
Do src/vmm/src/arch/aarch64/gic/gicv3/regs/icc_regs.rs and src/vmm/src/arch/aarch64/gic/gicv3/regs/redist_regs.rs seem to initialize these bits [4:0] to 0x1B (gicr_typer = 123), which could result in a maximum SPI INTID of 32*(27+1)-1 = 895 (sorry, I am a Rust beginner, learning as I go)?
let gicr_typer = 123;
let res = get_icc_regs(gic_fd.device_fd(), gicr_typer);
let mut state = res.unwrap();
assert_eq!(state.main_icc_regs.len(), 7);
assert_eq!(state.ap_icc_regs.len(), 8);
set_icc_regs(gic_fd.device_fd(), gicr_typer, &state).unwrap();
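The ITLinesNumber arithmetic quoted above can be checked with a short sketch. It only evaluates the formula from the specification excerpt. Whether the `gicr_typer` value in that test is ever interpreted as a GICD_TYPER is an open question in this thread, and the function name is illustrative.

```rust
/// Maximum SPI INTID for a given ITLinesNumber field (bits [4:0]),
/// using the formula 32(N+1) - 1 from the excerpt above.
/// The result is clamped to 1019, the top of the SPI range it mentions.
fn max_spi_intid(it_lines_number: u32) -> u32 {
    let n = it_lines_number & 0x1F; // keep only bits [4:0]
    (32 * (n + 1) - 1).min(1019)
}

fn main() {
    // Example from the specification text: 0b00011 -> 127.
    println!("{}", max_spi_intid(0b00011));
    // Low five bits of 123 are 27 (0x1B).
    let low_bits = 123 & 0x1F;
    println!("{} -> {}", low_bits, max_spi_intid(low_bits));
}
```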
Anyway this analysis may be out of scope for fixing this test. Let me know if digging deeper is appropriate in the current bug or not. Currently, it seems 96 SPIs is an overestimate of the available SPIs on a Linux guest kernel.
Hi @vliaskov. Thanks for sharing your progress. We have not had an opportunity to investigate the issue further, so we would still be happy to receive a contribution from the community to address it. Please feel free to post a PR if you are able to fix it.