Interface for OS- or CPU-specific implementations

Open qinsoon opened this issue 1 month ago • 7 comments

Currently, util::memory is our only module designed to abstract over OS-specific functionality, and it covers only memory-related system calls. The changes in https://github.com/mmtk/mmtk-core/pull/1418 show that other parts of mmtk-core are OS-dependent as well. We probably want a proper OS interface.

We could use traits to require each OS to implement certain functions and constants, similar to VMBinding. However, unlike VMBinding, the OS does not need to be a type argument. Instead, we can statically include exactly one implementation, selected by the target OS at compile time. Elsewhere in the code base, we use that implementation through the traits.

For example:

trait OperatingSystem {
  type Memory;
  type Process;
}

#[cfg(target_os = "linux")]
mod linux {
  pub struct Linux; // must be `pub` so it can be re-exported as `OS` below
  impl OperatingSystem for Linux {
    ...
  }
}
#[cfg(target_os = "linux")]
pub use linux::Linux as OS;

Elsewhere in the code:

// Hopefully Rust allows us to do the following rather than `<OS::Memory as Memory>::mmap_fixed()`
OS::Memory::mmap_fixed(...);
OS::Process::get_total_num_cpus();
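As far as I know, Rust rejects a path such as `OS::Memory::mmap_fixed(...)` when `OS` names a concrete type, reporting an ambiguous associated type (E0223), so the fully qualified form has to appear somewhere. One way to keep call sites short is to write it once in a type alias. The sketch below assumes hypothetical `Memory` and `Process` traits, a placeholder `page_size()` method, and a portable `Fallback` implementation; none of these names are taken from mmtk-core.

```rust
// Hypothetical traits for illustration; not mmtk-core's actual API.
pub trait Memory {
    fn page_size() -> usize;
}

pub trait Process {
    fn get_total_num_cpus() -> usize;
}

pub trait OperatingSystem {
    type Memory: Memory;
    type Process: Process;
}

// A single portable implementation keeps the sketch buildable everywhere.
// A real port would add `#[cfg(target_os = "...")]` modules next to it.
pub mod fallback {
    use super::{Memory, OperatingSystem, Process};

    pub struct Fallback;
    pub struct FallbackMemory;
    pub struct FallbackProcess;

    impl Memory for FallbackMemory {
        // Placeholder value; a real implementation would ask the OS.
        fn page_size() -> usize {
            4096
        }
    }

    impl Process for FallbackProcess {
        fn get_total_num_cpus() -> usize {
            std::thread::available_parallelism()
                .map(|n| n.get())
                .unwrap_or(1)
        }
    }

    impl OperatingSystem for Fallback {
        type Memory = FallbackMemory;
        type Process = FallbackProcess;
    }
}

pub use fallback::Fallback as OS;

// The aliases spell out the fully qualified path exactly once.
pub type OsMemory = <OS as OperatingSystem>::Memory;
pub type OsProcess = <OS as OperatingSystem>::Process;
```

Call sites can then write `OsMemory::page_size()` or `OsProcess::get_total_num_cpus()` without repeating the `<OS as OperatingSystem>` prefix.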

In addition to operating systems, there are parts of mmtk-core that are specific to the CPU, or the combination of CPU and OS.

Similar to the OS, we can have modules specific to the CPU or CPU+OS.

mod cpu {
    trait CPU {
        // insert CPU-specific properties or methods here
    }
    #[cfg(target_arch = "x86_64")]
    mod x86_64 {
        struct X86_64;
        impl CPU for X86_64 { ... }
        ...
    }
    #[cfg(target_arch = "aarch64")]
    mod aarch64 {
        ...
    }
}

qinsoon avatar Nov 27 '25 01:11 qinsoon

See https://github.com/mmtk/mmtk-core/issues/1370

k-sareen avatar Nov 27 '25 01:11 k-sareen

Just observations from https://github.com/mmtk/mmtk-core/pull/1418:

  • We probably want to avoid using posix_memalign -- there isn't an equivalent implementation on Windows. See https://github.com/mmtk/mmtk-core/pull/1418/files#r2566740898. If we do need it, we have to free such allocations with a dedicated free function (rather than the general free) to be Windows-compatible.
  • Mmap flags (such as no replace, fixed, etc.) should be abstracted into MmapStrategy.
  • MMAP_FIXED_NO_REPLACE is ignored on macOS, which has no support for it. It is also ignored on Windows in https://github.com/mmtk/mmtk-core/pull/1418, although a proper implementation of the no-replace semantics may be possible there. So the interface may need to allow some methods to be 'downgraded' to a more relaxed version in some OS implementations.
  • The huge page flag is another example. It is properly implemented only on Linux. It could be implemented for some other OSes, but it is currently ignored there.
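
The 'downgrade' idea in the bullets above could be sketched roughly as follows: each OS declares which optional mmap features it honors, and a request is relaxed to what the platform supports. All names here (`MmapRequest`, `MmapSupport`, `LinuxLike`, `MacLike`) are hypothetical, and the capability values only restate what the bullets say; this is not the existing MmapStrategy type.

```rust
/// Optional features a caller may ask for (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct MmapRequest {
    pub no_replace: bool,
    pub huge_page: bool,
}

/// Per-OS capabilities (hypothetical trait).
pub trait MmapSupport {
    const NO_REPLACE: bool;
    const HUGE_PAGE: bool;

    /// Relax a request: unsupported features are dropped, not rejected.
    fn downgrade(req: MmapRequest) -> MmapRequest {
        MmapRequest {
            no_replace: req.no_replace && Self::NO_REPLACE,
            huge_page: req.huge_page && Self::HUGE_PAGE,
        }
    }
}

/// Stand-in for an OS that honors both features.
pub struct LinuxLike;
impl MmapSupport for LinuxLike {
    const NO_REPLACE: bool = true;
    const HUGE_PAGE: bool = true;
}

/// Stand-in for an OS that ignores both features.
pub struct MacLike;
impl MmapSupport for MacLike {
    const NO_REPLACE: bool = false;
    const HUGE_PAGE: bool = false;
}
```

Returning the effective request, rather than silently ignoring flags, would let a caller notice the downgrade and compensate, for example by checking for overlapping mappings itself.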

qinsoon avatar Nov 27 '25 01:11 qinsoon

@wks Hopefully we won't need too much CPU-specific implementation in mmtk-core. I mean we probably won't need ISA-specific things (x86, ARM, etc.). Things like endianness or pointer width are necessary, though.
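
Properties such as endianness and pointer width are already known to the compiler, so a CPU trait for them may need no per-ISA code at all. A minimal sketch, assuming a hypothetical `Cpu` trait and `TargetCpu` type (neither exists in mmtk-core):

```rust
/// Hypothetical trait for CPU properties known at compile time.
pub trait Cpu {
    const LOG_BYTES_IN_ADDRESS: usize;
    const IS_LITTLE_ENDIAN: bool;
}

/// The CPU the crate is being compiled for.
pub struct TargetCpu;

impl Cpu for TargetCpu {
    // usize::BITS / 8 is a power of two, so trailing_zeros() is its log2.
    const LOG_BYTES_IN_ADDRESS: usize = (usize::BITS / 8).trailing_zeros() as usize;
    // Evaluated by the compiler from the target triple.
    const IS_LITTLE_ENDIAN: bool = cfg!(target_endian = "little");
}
```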

qinsoon avatar Nov 27 '25 01:11 qinsoon

I mean we probably won't need ISA-specific things (x86, ARM, etc.)

You need them if you want to use SIMD instructions for example. That's extremely ISA-specific. For example, @no-defun-allowed's fast bitmap traversal is SIMD-based.

k-sareen avatar Nov 27 '25 02:11 k-sareen

@wks Hopefully we won't need too much CPU-specific implementation in mmtk-core. I mean we probably won't need ISA-specific things (x86, ARM, etc.). Things like endianness or pointer width are necessary, though.

Probably. Most things can be implemented in pure Rust. If we need to optimize an algorithm for a certain CPU (for example, scanning bit fields with CPU-specific instructions), and we want to implement it ourselves instead of using third-party crates, we can specialize that algorithm in the CPU module and provide a generic pure-Rust implementation for all other CPUs.

There are also CPU-specific properties that are important to us (page sizes, for example) but can be queried dynamically at run time from the OS (getpagesize()). We can handle those in the OS-specific modules.
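
The "specialize in the CPU module, fall back to generic Rust" pattern might look like the sketch below. It counts set bits in a bitmap, using the x86_64 `popcnt` instruction when the running CPU reports it and portable Rust otherwise. The function names are hypothetical, and popcount is chosen only because it is simple; it is not one of mmtk-core's actual hot paths.

```rust
/// Portable implementation, available on every target.
pub fn count_marked_generic(words: &[u64]) -> usize {
    words.iter().map(|w| w.count_ones() as usize).sum()
}

#[cfg(target_arch = "x86_64")]
mod x86_64 {
    use std::arch::x86_64::_popcnt64;

    /// Specialized implementation.
    /// Safety: the caller must ensure the CPU supports `popcnt`.
    #[target_feature(enable = "popcnt")]
    pub unsafe fn count_marked_popcnt(words: &[u64]) -> usize {
        let mut total = 0usize;
        for &w in words {
            total += _popcnt64(w as i64) as usize;
        }
        total
    }
}

/// Entry point used by the rest of the code base.
pub fn count_marked(words: &[u64]) -> usize {
    #[cfg(target_arch = "x86_64")]
    {
        // Runtime check, so the binary still runs on CPUs without popcnt.
        if std::is_x86_feature_detected!("popcnt") {
            // Safe because the feature was detected just above.
            return unsafe { x86_64::count_marked_popcnt(words) };
        }
    }
    count_marked_generic(words)
}
```

Callers only ever see `count_marked`, so adding an aarch64 variant later would not change any call site.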

wks avatar Nov 27 '25 02:11 wks

You need them if you want to use SIMD instructions for example. That's extremely ISA-specific. For example, @no-defun-allowed's fast bitmap traversal is SIMD-based.

We'll have std::simd one day, but my algorithm for computing the offset vector in Compressor uses a carryless-multiply instruction, for which I haven't seen any portable wrappers.

no-defun-allowed avatar Nov 27 '25 05:11 no-defun-allowed

Another concern is that, with limited resources (expertise, available machines, license fees, funding, etc.), we are probably not going to maintain all possible targets. But there are researchers and enthusiasts (as in https://github.com/mmtk/mmtk-core/pull/1418) who want to port MMTk to platforms (OS, CPU, or both) of their choice.

It would be nice to allow OS-specific or CPU-specific parts to be implemented outside the mmtk crate (and outside the https://github.com/mmtk/mmtk-core repository). Java calls this a Service Provider Interface (SPI), and many aspects of the Java API, including sound, file systems, databases, cryptography, etc., can be implemented by third parties.

Similar to Java, we can make the OperatingSystem and CPU traits public so that users can implement them. Those implementations can then be plugged into an MMTK instance at construction time, either by passing a Box<dyn OperatingSystem> to the MMTK instance or by providing a generic parameter <T: OperatingSystem>. I think most of the OS-specific utilities, such as memory mapping, are on the slow path, so the cost of dynamic dispatch should not be a bottleneck.
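
A minimal sketch of the Box<dyn OperatingSystem> variant follows. For the trait to be usable as a trait object, its methods have to take `&self`, so this shape differs from a trait built on associated types and static functions. `Instance` is a stand-in for an MMTk instance, and `ToyOs` plays the role of a third-party implementation; every name and method here is hypothetical.

```rust
/// Hypothetical object-safe variant of the OS interface.
pub trait OperatingSystem: Send + Sync {
    fn name(&self) -> &'static str;
    fn total_num_cpus(&self) -> usize;
}

/// Stand-in for an MMTk instance; the real type has a different shape.
pub struct Instance {
    os: Box<dyn OperatingSystem>,
}

impl Instance {
    /// The OS implementation is supplied at construction time.
    pub fn new(os: Box<dyn OperatingSystem>) -> Self {
        Self { os }
    }

    /// Every call goes through dynamic dispatch on the boxed trait object.
    pub fn describe(&self) -> String {
        format!("{} with {} CPUs", self.os.name(), self.os.total_num_cpus())
    }
}

/// An implementation that could live in a third-party crate.
pub struct ToyOs;

impl OperatingSystem for ToyOs {
    fn name(&self) -> &'static str {
        "toy-os"
    }

    fn total_num_cpus(&self) -> usize {
        4 // fixed value, for illustration only
    }
}
```

With this shape, a port is plugged in as `Instance::new(Box::new(ToyOs))`, and nothing in the core crate needs a `cfg` for the new platform.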

See:

  • https://en.wikipedia.org/wiki/Service_provider_interface
  • https://docs.oracle.com/en/java/javase/25/docs/api/java.base/java/util/ServiceLoader.html

wks avatar Nov 27 '25 06:11 wks