bootloader
Bootloader does not map all required pages
So I have no idea where to put this, though I've spoken about it in various places and have gotten no response in a month or so. (I apologize if I seem impatient.) It doesn't appear like the bootloader is mapping all of the required paged memory regions that my kernel uses. This in turn makes it prohibitive to continue my project because I have no idea why this is even happening to begin with. My debugging has led me, repeatedly, to conclude that it's a bug when x86_64 goes to zero an entry in one of the paging structures, though which one I am unclear about. Specifically, it occurs when allocating the kernel heap. When my kernel attempts to allocate address 3000h (or, rather, 18000003000h) I get a #PF. Since (for some reason) my IDT is being ignored (presumably it isn't mapped yet), a second page fault occurs. Then another occurs because the ISRs aren't mapped, then a double fault, and then a triple fault. It has gotten to the point where I'm considering just making my own paging structures and not using the bootloader-provided ones. I want to avoid that, but if there is truly no solution to this then I won't have much of a choice. My code is available here. Does anyone know why this is happening? Is there something I'm missing? From what I can tell the bootloader sets CR3 and appears to map everything, but the behavior of my kernel leads me to believe otherwise.
I believe the intention is that the bootloader-provided page tables are just temporary, to allow the kernel to run at all. The kernel can then map all pages it needs and unmap those it doesn't need.
That's what my code does, but I don't set up my own paging structures right now (I don't change CR3). Do I need to go through that entire process? If so, the blog never covered actually setting up those tables, and I'd need to locate the kernel stack, code/data sections, etc., and map them in, I imagine.
> It doesn't appear like the bootloader is mapping all of the required paged memory regions that my kernel uses
> Specifically, it occurs when allocating the kernel heap. When my kernel attempts to allocate address 3000h (or, rather, 18000003000h) I get a #PF
The bootloader maps all pages of your kernel code and data segments. Your description sounds like you're choosing an arbitrary virtual address range for your heap. This address range is not mapped by the bootloader because it's not part of a code or data segment of your kernel. Instead, you need to add the required mappings to the page tables yourself.
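For reference, here is a minimal sketch of how the faulting address breaks down. It assumes 4-level paging with 4 KiB pages and takes 18000000000h as the physical memory offset (the value the kernel logs in this thread). `table_indexes` and `phys_behind_offset` are hypothetical helper names for illustration, not part of the bootloader or the `x86_64` crate.

```rust
/// Indexes into the four x86_64 page-table levels, plus the in-page offset.
#[derive(Debug, PartialEq, Eq)]
pub struct TableIndexes {
    pub p4: u16,
    pub p3: u16,
    pub p2: u16,
    pub p1: u16,
    pub offset: u16,
}

/// Splits a virtual address into its 9-bit table indexes (4-level paging, 4 KiB pages).
pub fn table_indexes(virt: u64) -> TableIndexes {
    TableIndexes {
        p4: ((virt >> 39) & 0x1FF) as u16,
        p3: ((virt >> 30) & 0x1FF) as u16,
        p2: ((virt >> 21) & 0x1FF) as u16,
        p1: ((virt >> 12) & 0x1FF) as u16,
        offset: (virt & 0xFFF) as u16,
    }
}

/// If `virt` lies at or above the physical-memory offset, returns the physical
/// address it would alias through that mapping; otherwise returns `None`.
pub fn phys_behind_offset(virt: u64, physical_memory_offset: u64) -> Option<u64> {
    virt.checked_sub(physical_memory_offset)
}
```

With the values from this thread, `table_indexes(0x180_0000_3000)` gives P4 index 3 and P1 index 3, and `phys_behind_offset(0x180_0000_3000, 0x180_0000_0000)` yields 3000h. If that reading is right, the faulting access went to physical frame 3000h through the physical-memory mapping rather than to the heap range itself.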
@phil-opp I should clarify: I don't know where the 3000h is coming from. The 18000000000h part is my page table virtual offset. That is, this is the code that causes the #PF:
// In main.rs
info!("Initializing virtual memory manager");
let rdrand = RdRand::new().unwrap();
let mut start_addr: u64 = 0x0100_0000_0000 + rdrand.get_u64().unwrap();
start_addr.set_bits(47..64, 0);
let mut end_addr = start_addr + MAX_HEAP_SIZE;
end_addr.set_bits(47..64, 0);
libk::memory::init(
    boot_info.physical_memory_offset.into_option().unwrap(),
    start_addr,
    MAX_HEAP_SIZE,
);
// In memory.rs
/// Initializes the memory subsystem.
#[cold]
pub fn init(physical_memory_offset: u64, start_addr: u64, size: u64) {
    let mut mapper = MAPPER.lock();
    *mapper = Some(unsafe { init_mapper(physical_memory_offset) });
    let mut allocator = FRAME_ALLOCATOR.lock();
    *allocator = Some(GlobalFrameAllocator::init());
    let end_addr = start_addr + size;
    match (mapper.as_mut(), allocator.as_mut()) {
        (Some(m), Some(a)) => allocate_paged_heap(start_addr, end_addr - start_addr, m, a),
        _ => panic!("Memory allocator or page frame allocator failed creation!"),
    }
}
/// Initializes a memory heap for the global memory allocator. Requires a PMO to start with.
#[cold]
unsafe fn init_mapper(physical_memory_offset: u64) -> OffsetPageTable<'static> {
    // Get active L4 table
    trace!(
        "Retrieving active L4 table with memoffset {:X}",
        physical_memory_offset // is 18000000000h
    );
    unsafe {
        let (level_4_table, _) = get_active_l4_table(physical_memory_offset);
        // initialize the mapper
        OffsetPageTable::new(level_4_table, VirtAddr::new(physical_memory_offset))
    }
}
// ...
/// Allocates a paged heap.
#[cold]
pub fn allocate_paged_heap(
    start: u64,
    size: u64,
    mapper: &mut impl Mapper<Size4KiB>,
    frame_allocator: &mut impl FrameAllocator<Size4KiB>,
) {
    debug!("Allocating heap in paged memory with start of {:X}", start);
    // Construct a page range
    let page_range = {
        // Calculate start and end
        let heap_start = VirtAddr::new(start);
        let heap_end = heap_start + size - 1u64;
        let heap_start_page = Page::containing_address(heap_start);
        let heap_end_page = Page::containing_address(heap_end);
        Page::range_inclusive(heap_start_page, heap_end_page)
    };
    debug!("Page range constructed: {:?}", page_range);
    // Allocate appropriate page frames
    page_range.for_each(|page| {
        debug!(
            "Requesting new page frame for page at addr {:X} with size {:X}",
            page.start_address().as_u64(),
            page.size()
        );
        debug!(
            "Page table indexes: {:?}, {:?}, {:?}, {:?}",
            page.p4_index(),
            page.p3_index(),
            page.p2_index(),
            page.p1_index()
        );
        let frame = match frame_allocator.allocate_frame() {
            Some(f) => f,
            None => panic!("Can't allocate frame!"),
        };
        let flags = PageTableFlags::PRESENT | PageTableFlags::WRITABLE;
        let frame2 = frame;
        debug!("Requesting mapping of page with flags {:X}", flags);
        unsafe {
            // Below line crashes...
            match mapper.map_to(page, frame, flags, frame_allocator) {
                Ok(f) => {
                    debug!("Map complete, flushing TLB");
                    f.flush();
                    MUSE.fetch_add(1, Ordering::Relaxed);
                }
                Err(e) => panic!(
                    "Cannot allocate frame range {:X}h-{:X}h: {:?}",
                    frame2.start_address().as_u64(),
                    frame2.start_address().as_u64() + frame2.size(),
                    e
                ),
            }
        }
    });
    SMUSE.fetch_add(size, Ordering::Relaxed);
}
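The start address computation in main.rs above adds a full random `u64` to the base, which can in principle overflow, and then masks the result, so the heap can land anywhere in the lower half, including on top of the physical-memory mapping. Below is a rough sketch of one alternative, assuming 4-level paging and 4 KiB pages. `pick_heap_start` and `overlaps` are hypothetical helpers, not functions from the kernel or any crate, and the caller is assumed to know which virtual ranges are already in use.

```rust
const PAGE_SIZE: u64 = 4096;
/// First address past the lower canonical half with 4-level paging (2^47).
const LOWER_HALF_END: u64 = 1 << 47;

/// Derives a page-aligned heap start from `random` so that
/// `[start, start + size)` stays inside `[base, LOWER_HALF_END)`.
/// Returns `None` if the window cannot hold a heap of `size` bytes.
pub fn pick_heap_start(base: u64, size: u64, random: u64) -> Option<u64> {
    // Round the base up to a page boundary without overflowing.
    let base = base.checked_add(PAGE_SIZE - 1)? & !(PAGE_SIZE - 1);
    // Bytes the start may slide while the heap still fits in the window.
    let slack = LOWER_HALF_END.checked_sub(base)?.checked_sub(size)?;
    // Number of page-aligned candidate positions.
    let slots = slack / PAGE_SIZE + 1;
    Some(base + (random % slots) * PAGE_SIZE)
}

/// Returns true if the half-open ranges `[a, a + a_len)` and `[b, b + b_len)` intersect.
pub fn overlaps(a: u64, a_len: u64, b: u64, b_len: u64) -> bool {
    a < b.saturating_add(b_len) && b < a.saturating_add(a_len)
}
```

A caller could draw a new random value whenever `overlaps` reports a collision with a known range, for example the physical-memory mapping that starts at 18000000000h in this thread. Reducing the random value modulo the slot count is slightly biased, which is assumed to be acceptable for this purpose.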
I create a random heap location because I don't want to clobber any existing memory, and I think that 10000000000h plus a random 64-bit offset should avoid clobbering anything, though this code might be wrong. I need the start address for my memory allocator (the Rust one as well as the VMM), so I'm not quite sure how to go about resolving this bug. I could go through the list of free memory regions, select the largest one, and pick a random offset in there, but the problem is that the bootloader gives me an invalid memory map (all addresses in hex):
| Start | End | Type |
|---|---|---|
| 0 | 1000 | usable |
| 1000 | 46000 | bootloader |
| 46000 | A0000 | usable |
| 100000 | 800000 | usable |
| 800000 | 808000 | acpi non-volatile |
| 808000 | 810000 | usable |
| 810000 | 900000 | acpi non-volatile |
| 900000 | 1500000 | usable |
| 1500000 | 7BF36000 | usable |
| 7BF36000 | 7BF56000 | usable |
| 7BF56000 | 7E8C7000 | usable |
| 7E8C7000 | 7ECCE000 | usable |
| ... | ... | usable |
| 7FAEF000 | 7FB6F000 | reserved |
| 7FB6F000 | 7FB7F000 | acpi reclaimable |
| 7FB7F000 | 7FBFF000 | acpi non-volatile |
| 7FBFF000 | 7FE00000 | usable |
| 7FE00000 | 7FEF6000 | usable |
| ... | ... | usable |
| 7FF78000 | 80000000 | acpi non-volatile |
| 100000000 | 140000000 | usable |
| 140000000 | 1400A6000 | usable |
| 1400A6000 | 580000000 | usable |
| B0000000 | C0000000 | reserved |
| FFE00000 | 100000000 | mmio |
As you can see, the map contains no loader code/loader data, boot services code/data, runtime services code/data, etc. The UEFI specification mandates that all UEFI implementations MUST expose UEFI runtime services in the memory map, and, assuming that the bootloader isn't doing any trickery with it behind the scenes (which it isn't, if I'm understanding the code correctly), this is the exact same memory map given to the bootloader by EDK II. EDK II is UEFI spec compliant, so the bootloader must be doing something wrong. Since the bootloader doesn't give me any strong enumeration variants to work with, I am forced to try to interpret the enum values by hand:
for region in boot_info.memory_regions.iter() {
    info!(
        "| {:X} | {:X} | {} |",
        region.start,
        region.end,
        match region.kind {
            MemoryRegionKind::Usable => "usable",
            MemoryRegionKind::UnknownUefi(kind) => match kind {
                0 => "reserved",
                1 => "loader code",
                2 => "loader data",
                3 => "boot services code",
                4 => "boot services data",
                5 => "runtime services code",
                6 => "runtime services data",
                7 => "Usable",
                8 => "unusable",
                9 => "acpi reclaimable",
                10 => "acpi non-volatile",
                11 => "mmio",
                12 => "port mmio",
                13 => "pal code",
                14 => "free nvm",
                _ => "unknown",
            },
            MemoryRegionKind::UnknownBios(_) => "unknown",
            MemoryRegionKind::Bootloader => "bootloader",
            _ => "Unknown",
        }
    );
}
So, to wrap this up: what is the best solution to this problem? And how would I go about fiddling with the page tables? I thought that the heap allocation routine was supposed to do that.
Update (just commenting again so people using emails see this): the other issue is that I don't really see any other way of passing the strong enum values to kernels. I don't think you can do that over an FFI boundary, but at the same time you shouldn't ever try to (manually) interpret the memory map types, since an implementation can use any value it wants for any of them, excluding those that the UEFI spec explicitly marks as reserved. So this feels kind of like a catch-22: the bootloader has to tell the kernel what memory region is of what type, but the bootloader can't pass strong Rust enums over an FFI boundary to the kernel, and the kernel can't just interpret a particular memory type value as "free" or whatever, because the hard-coded values may not even remotely match the UEFI implementation's values (say, an implementation could use 1000h for MMIO regions).
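One possible way around the FFI concern is sketched below. The UEFI specification assigns fixed numeric values 0 through 14 to the standard `EFI_MEMORY_TYPE` entries, so a type can cross the boundary as its raw 32-bit value and still be given names on the kernel side. `RawMemoryType` and its constant names are hypothetical and chosen for illustration; this is not the bootloader's actual API.

```rust
/// A UEFI memory type carried across the boot boundary as its raw 32-bit value.
/// `repr(transparent)` gives it the same layout as a plain `u32`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RawMemoryType(pub u32);

impl RawMemoryType {
    // Numeric values follow the EFI_MEMORY_TYPE table in the UEFI specification.
    pub const RESERVED: Self = Self(0);
    pub const LOADER_CODE: Self = Self(1);
    pub const LOADER_DATA: Self = Self(2);
    pub const BOOT_SERVICES_CODE: Self = Self(3);
    pub const BOOT_SERVICES_DATA: Self = Self(4);
    pub const RUNTIME_SERVICES_CODE: Self = Self(5);
    pub const RUNTIME_SERVICES_DATA: Self = Self(6);
    pub const CONVENTIONAL: Self = Self(7);
    pub const UNUSABLE: Self = Self(8);
    pub const ACPI_RECLAIM: Self = Self(9);
    pub const ACPI_NON_VOLATILE: Self = Self(10);
    pub const MMIO: Self = Self(11);
    pub const MMIO_PORT_SPACE: Self = Self(12);
    pub const PAL_CODE: Self = Self(13);
    pub const PERSISTENT_MEMORY: Self = Self(14);

    /// Human-readable name; values outside the standard table map to "unknown".
    pub fn name(self) -> &'static str {
        match self {
            Self::RESERVED => "reserved",
            Self::LOADER_CODE => "loader code",
            Self::LOADER_DATA => "loader data",
            Self::BOOT_SERVICES_CODE => "boot services code",
            Self::BOOT_SERVICES_DATA => "boot services data",
            Self::RUNTIME_SERVICES_CODE => "runtime services code",
            Self::RUNTIME_SERVICES_DATA => "runtime services data",
            Self::CONVENTIONAL => "conventional",
            Self::UNUSABLE => "unusable",
            Self::ACPI_RECLAIM => "acpi reclaimable",
            Self::ACPI_NON_VOLATILE => "acpi non-volatile",
            Self::MMIO => "mmio",
            Self::MMIO_PORT_SPACE => "mmio port space",
            Self::PAL_CODE => "pal code",
            Self::PERSISTENT_MEMORY => "persistent memory",
            _ => "unknown",
        }
    }
}
```

Because the wrapper is only a `u32` underneath, OEM-defined or OS-loader-defined values pass through unchanged and simply report as "unknown".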
Okay, so... This I don't get. In the bootloader we have this code:
use uefi::table::boot::MemoryType as M;
match M(other) {
    M::LOADER_CODE
    | M::LOADER_DATA
    | M::BOOT_SERVICES_CODE
    | M::BOOT_SERVICES_DATA
    | M::RUNTIME_SERVICES_CODE
    | M::RUNTIME_SERVICES_DATA => MemoryRegionKind::Usable,
    other => MemoryRegionKind::UnknownUefi(other.0),
}
Some of these items are not in fact "usable" unless you want the OS to be unable to use any runtime services at all. The loader code/data I get (though I wonder if that might point to part of the bootloader), but definitely not the runtime services. Is there a reason we mark those as usable and don't just pass the value directly?
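To illustrate the distinction being asked about, here is a rough sketch of a classification that keeps runtime services regions separate. It works on the raw UEFI type numbers so that it stands alone. `RegionUse` and `classify_uefi_type` are hypothetical names, and this is not the bootloader's current behavior.

```rust
/// How a kernel might treat a UEFI region once boot services have been exited.
#[derive(Debug, PartialEq, Eq)]
pub enum RegionUse {
    /// Loader, boot services, and conventional memory: free for general use.
    Usable,
    /// Runtime services code/data: must stay intact if the OS wants to call
    /// runtime services later.
    RuntimeServices,
    /// Everything else is passed through with its raw type value.
    Other(u32),
}

/// Classifies a raw EFI_MEMORY_TYPE value.
/// Assumption: the loader has already recorded any loader data it still needs
/// (page tables, boot info) as a separate region kind before this runs.
pub fn classify_uefi_type(raw: u32) -> RegionUse {
    match raw {
        // 1 = loader code, 2 = loader data, 3 = boot services code,
        // 4 = boot services data, 7 = conventional memory
        1 | 2 | 3 | 4 | 7 => RegionUse::Usable,
        // 5 = runtime services code, 6 = runtime services data
        5 | 6 => RegionUse::RuntimeServices,
        other => RegionUse::Other(other),
    }
}
```

With this split, a kernel that never calls runtime services could still choose to reclaim those regions, but that decision would be made explicitly on the kernel side rather than by the loader.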