FireMarshal
no-disk builds failing over a certain size
--no-disk builds larger than about 128 MB have begun failing. This used to work, so it's a mystery why it's happening again.
Issue originally reported here: ucb-bar/chipyard#950
Reproduction instructions: use a workload like tests/overlay.yaml and dd a large file (>100 MB) into the overlay. Building without --no-disk should work; building with --no-disk will hang early in the boot process. See the original bug report on chipyard for details.
With older versions of the Linux kernel, the provisional page tables created when virtual memory is first enabled can map at most a certain range, hardcoded to be only 128 MiB by default. If the size of the kernel image exceeds this limit, then setup_vm() simply BUG()s out and enters an infinite loop:
https://github.com/firesim/linux/blob/280191c0a6693bce79bec8ef235f58d5f3de4a47/arch/riscv/mm/init.c#L393
Increase MAX_EARLY_MAPPING_SIZE in boards/default/linux/arch/riscv/mm/init.c to work around this:
https://github.com/firesim/linux/blob/280191c0a6693bce79bec8ef235f58d5f3de4a47/arch/riscv/mm/init.c#L190
#define MAX_EARLY_MAPPING_SIZE (SZ_1G)
Fortunately, it appears this limitation has been fixed as of v5.12 (https://github.com/torvalds/linux/commit/0f02de4481da684aad6589aed0ea47bd1ab391c9).
It looks like v5.12 and later bring only a partial fix. The VM init code was simplified to statically allocate only a single 2nd-level page table (PMD) for the provisional kernel mappings, effectively limiting the range to 1 GiB (i.e., 512 entries of 2 MiB megapages).
I don't think extremely large initramfs images will be well-supported going forward. To be fair, it is not a common use case on real systems.
I have been talking with @davidbiancolin about bumping the kernel anyway, but there was apparently some problem. I think bumping to 5.14 (currently the latest stable) is the best solution here. I don't know if anyone has an image bigger than 1 GiB, but I can't imagine it would work well anywhere anyway. At that point, I imagine you'd be better off writing a disk model for Spike than fixing Linux.