bootloader Decrease Kernel Stack Size

The bootloader currently sets up huge stack for the kernel by default (512 pages = 2MB). Now that the stack size is configurable, we should probably reduce the default.

cc @64

Sep 14 '19 07:09 phil-opp

How many pages do you think is a good default? I can do a bit of investigation to see how much is being used.

Sep 14 '19 16:09 64

I'm not sure honestly. According to https://www.kernel.org/doc/html/latest/x86/kernel-stacks.html, Linux only uses a 8KiB stack, so we can definitely reduce it considerably.

Sep 14 '19 17:09 phil-opp

On the blog_os/post-09 branch, 5 pages is the minimum that the kernel can use without crashing (in release, this number is 3). So maybe 10 or 16 would be a reasonable number?

Sep 17 '19 00:09 64

Sounds reasonable.

I did a quick experiment using the -Z emit-stack-size flag and the stack-sizes crate:

Debug:

Address	Size	Function
0x000000000000ad90	8280	`spin::once::Once<T>::call_once::h51bca0b20078dc27`
0x0000000000022100	4200	`x86_64::structures::idt::InterruptDescriptorTable::new::hd32c5c9dadb04897`
0x000000000001a830	4184	`core::ops::function::FnOnce::call_once::hee973ff0694526b7`
0x000000000001aa40	2776	`<pc_keyboard::KeyCode as core::fmt::Debug>::fmt::hda7c4f2f9fb20e82`
0x0000000000006ee0	1304	`blog_os::kernel_main::hc3a3a2c983130cd8`
0x0000000000003960	832	`blog_os::interrupts::page_fault_handler::h204c58ca3e079357`
0x0000000000013900	728	`core::fmt::Formatter::pad::h627264a917b3c4e5`

Release:

Address	Size	Function
0x0000000000003de0	8248	`spin::once::Once<T>::call_once::h81ce430b91fc8bc5`
0x00000000000069a0	3720	`x86_64::structures::idt::InterruptDescriptorTable::new::hff84b1851a4e5332`
0x0000000000004040	272	`blog_os::interrupts::page_fault_handler::hca19ad914c532427`
0x0000000000002970	264	`blog_os::kernel_main::h19c30b4a6b5987ca`

So the Once type and the creation of the IDT is responsible for most of the stack use.

Sep 18 '19 08:09 phil-opp

I tried to initialize the IDT without lazy static. Instead I used a Mutex protected static:

static IDT: spin::Mutex<InterruptDescriptorTable> = spin::Mutex::new(InterruptDescriptorTable::new());

This way, the large IDT is initialized at compile time and does not need to be constructed on the stack. To be able to load it, I needed to add a custom lock_leak function to the Mutex type, which locks the mutex indefinitely and returns a &'static mut reference.

With this approach, the stack sizes are much smaller:

Debug:

Address	Size	Function
0x0000000000014210	2776	<pc_keyboard::KeyCode as core::fmt::Debug>::fmt::hda7c4f2f9fb20e82
0x0000000000008280	1416	blog_os::kernel_main::h7f109f414e18afd3
0x000000000000cc00	896	blog_os::interrupts::page_fault_handler::hf82cf137aee3c1b6
0x0000000000005400	728	alloc::raw_vec::RawVec<T
0x000000000000a880	568	<core::slice::Iter<T> as core::iter::traits::iterator::Iterator>::try_fold::ha720e2a3fd963112

Release:

Address	Size	Function
0x0000000000004130	272	blog_os::interrupts::page_fault_handler::h98b3318d6c4ff705
0x0000000000002960	264	blog_os::kernel_main::h44cfb55497ed624e
0x0000000000003bc0	248	spin::once::Once<T>::call_once::h6653ebe0f1309568
0x00000000000043a0	224	blog_os::interrupts::double_fault_handler::hef2e5697b10cc94b
0x0000000000004040	216	blog_os::interrupts::breakpoint_handler::h8a0b1ee731e065af
0x0000000000003cb0	200	spin::once::Once<T>::call_once::hd8582ddc822ac52c
0x0000000000006e40	200	x86_64::registers::control::x86_64::<impl x86_64::registers::control::Cr3>::read::h825613083c74e7df

I managed to run the release binary with kernel-stack-size = 1.

Sep 18 '19 08:09 phil-opp

That’s excellent! I saw that crate too, but it seemed like it was going to be annoying to set up. Did you have to modify the linker script at all to get it to work? It would be good to have some instructions somewhere on how to do this; I’m happy to write something up.

Sep 18 '19 14:09 64

It was definitely a bit annoying to set up. I also had some problems with using the cargo stack-sizes command directly, I assume because of cargo-xbuild.

The steps that worked for me were:

Create a linker script to preserve the .stack_sizes ELF section:

/* file: keep-stack-sizes.x */
SECTIONS
{
  /* `INFO` makes the section not allocatable so it won't be loaded into memory */
  .stack_sizes (INFO) :
  {
    KEEP(*(.stack_sizes));
  }
}

Run RUSTFLAGS="--sysroot /path/to/your/project/target/sysroot -Zemit-stack-sizes" cargo rustc --bin blog_os -- -C link-arg=-Tkeep-stack-sizes.x -C link-arg=-N to create an ELF file with a .stack_sizes section
- Requires that you do a normal cargo xbuild before to create the sysroot.
Install the stack-sizes crate: cargo install stack-sizes
Run stack-sizes target/x86_64-blog_os/debug/blog_os > stack-sizes.csv
Sort the CSV file by size column. I used LibreOffice Calc for this, but any CSV tool should work.
Repeat the steps with --release and target/x86_64-blog_os/release/blog_os if you like.

Sep 18 '19 17:09 phil-opp

@phil-opp Hey, I know this is way above my paygrade currently, but would you be so kind to elaborate on the lock_leak function you talked about earlier? The stack usage difference it causes is pretty big, so it'd be cool to talk about it in another part regarding interrupts on the os blog, if that'd be at all possible.

Thanks :)

Oct 20 '19 14:10 L3tum

@L3tum Sorry for the late reply! I definitely plan to implement some solution to this problem for the blog, I'm just not sure about the best approach yet.

Regarding the lock_leak function: To load the IDT, we need a 'static reference to it. The normal lock function, however, only returns a reference that lives as long as the static variable is borrowed, i.e. only as long as the init_idt function runs. To get a 'static reference from the Mutex, we need an additional function that locks the lock indefinitely and returns a 'static reference (which I called lock_leak above).

Note that this is only one approach for solving this problem. Alternatively, we could try to make the set_handler function a const function, so that we could directly initialize it at compile time. If that's not possible, it might be preferable to create a new type instead of adding the lock_leak function in order to prevent accidental deadlocks. For example, we could create a type that hands out a single &'static mut reference and panics on subsequent calls.

Nov 02 '19 22:11 phil-opp

bootloader bootloader copied to clipboard

Decrease Kernel Stack Size

bootloader
bootloader copied to clipboard