rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

First-class compile time type manipulation

Open cjhowe-us opened this issue 1 year ago • 13 comments

Zig supports comptime, which is a keyword that allows for the evaluation of arbitrary expressions and entire functions at compile time. It is also used in Zig's implementation of generics, so that types can be manipulated at compile time to generate code. In particular, Zig supports the following compile time features:

  • Treating types as first-class compile-time values
  • Treating arguments as compile-time arguments as opposed to runtime arguments (that is, their value must be known at compile time)
  • Generating structs/enums at compile time
  • Emitting compiler errors from compile-time contexts, compile-time type validation
  • Manually unroll loops (compile time for) and generate runtime code at compile time

Rust has a few existing features that address compile-time code evaluation:

  • Const fns allow execution of a subset of Rust at compile time when a function is explicitly marked as const
  • Procedural macros allow for execution of code at compile time to manipulate syntactical elements. They exist at compile time but work at a domain above the type system, so are limited in the kinds of analysis they can perform (especially around types)

These are a few of the goals that could be addressed with compile-time code evaluation and type manipulation:

  • Variadic generics could be represented as a list of compile-time types without needing to worry about their runtime representation (including tuple byte layout issues in #376)
  • Reflection can be supported without any runtime type representation necessary
  • Manipulating types with regular code is a lot simpler than trying to generate code using procedural or even declarative macros in many cases
  • Compile-time code can be run to validate preconditions necessary for generated code
  • Optimized or specialized code can be generated using compile time code evaluation, without needing partial specialization, based on arbitrary conditions
  • Contracts
  • Tools (such as linters and IDEs) can run compile time code during development to ensure that types validate and produce correct code
  • Compile-time code could access compiler features that aren't available at runtime (IE, ask the compiler to check for trait conformances)

I have some ideas for how this could be implemented, but ultimately I wanted to build a path towards variadic generics through compile-time code evaluation and generation. There are many features we would need to expand on first before we could work on many of the features listed above, mostly around making const and generics more powerful in Rust. And we would need a way to manipulate types in const fns and return them to be used as generic arguments.

What are the community's thoughts about this as a future direction for Rust?

cjhowe-us avatar Jul 09 '24 04:07 cjhowe-us

Here are the ideas I had for how we could go about implementing this feature by feature.

First, we could start by introducing const params to non-const functions that must be known at compile time. For a const fn all parameters must be known at compile time. For a non-const fn, some parameters may be const and others may be passed at runtime. I suspect this feature can be useful even with the existing const primitives we have today, even if limited. For example:

const myXs: [i32] = [1, 2, 5]

// Standard const fn (all variables and parameters const)
const fn my_const_fn(xs: [i32], y: i32) -> i32 {
  let sum: i32 = 0;
  for x in xs {
    sum += x;
  }
  return sum + y
}

// `my_const_fn(myXs, y)` not allowed unless y is const
// `my_const_fn(myXs, 6)` becomes `return (1 + 2 + 5) + 6 == 14` after compile time evaluation

// Non-const fn with const parameters
fn my_non_const_fn(const xs: [i32], y: i32) -> i32 {
  // Can use const parameters within const blocks
 return const {
    let mut sum: i32 = 0;
    for x in xs {
      sum += x;
    }
    sum
  } + y
}

// `my_non_const_fn(myXs, y)` is valid with non-const `y` and becomes `return (1 + 2 + 5) + y`
// `my_non_const_fn(myXs, 6)` becomes `return (1 + 2 + 5) + 6 == 14`, same as above

The next step would involve allowing the generation of runtime code within compile time contexts. This allows one to put runtime code within an inline const block:

fn runtime_in_const(const xs: [i32], ys: [i32]) -> i32 {
 const {
    for (i, x) in xs.enumerate() {
      runtime {
        if x > ys[i] {
          return x;
        }
      }
    }
  }
  return 0;
}

// runtime_in_const([1, 2], ys) would generate the following specialized function at runtime:
fn runtime_in_const_specialized(ys: [i32]) -> i32 {
  if 1 > ys[0] {
    return 1;
  }
  if 2 > ys[1] {
    return 2;
  }
  return 0;
}

The final step would be to reify types so they can be queried and manipulated in const contexts, and then provide a way to call a const fns on generic arguments to produce new type arguments. I'm still figuring out how this might look, but should start to resemble Zig.

cjhowe-us avatar Jul 09 '24 04:07 cjhowe-us

Reference (not sure useful or not): https://github.com/soasis/rust/blob/ext/basic-reflection/library/core/src/introwospection.rs

SichangHe avatar Jul 21 '24 10:07 SichangHe

This would be closer to what we have (except const generics only takes integers, bool and char):

// Non-const fn with const parameters
fn my_non_const_fn<const XS: &'static [i32]>(y: i32) -> i32 {
    let mut sum = 0;
    let mut index = 0;
    while index < XS.len() {
        sum += XS[index];
        index += 1;
    }
    sum + y
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=376058daa0d54bc0956af2512e3f50ce

SichangHe avatar Jul 21 '24 15:07 SichangHe

Blocking on https://github.com/rust-lang/rust/labels/F-generic_const_exprs https://github.com/rust-lang/rust/labels/F-adt_const_params, it seems.

SichangHe avatar Jul 22 '24 07:07 SichangHe

There are several const-related features that should be handled first to improve the applicability of this feature. Mainly, it would be ideal if we can use existing types in unique ways to generate new types and functions.

To do this, we could make const expressions even more powerful by implementing multi-stage compilation. This would allow const expressions to generate additional stages of compilation by referencing the results of other const expression. That generated const expression is then executed as part of compilation after the expressions it references (directly or indirectly). The subexpression may then generate even more const code to produce additional compilation stages dynamically.

Ideally, we could easily interleave these const expressions with regular functions. Each function would be executed twice: once at the const level to generate all the runtime code, then again at runtime. There are other uses for this: for example, the type checker has to "symbolically execute" each generic function to generate its prototype, passing in any generic type parameters. We could make this more dynamic by allowing the const expressions to dynamically execute code as part of the type checking process.

This requires validation of the const expressions generated at each stage of compilation. Each const expression must not reference variables that are bound at a later stage of compilation. The stage of an expression is simply the highest stage of any variables it references, plus one. This allows it to recursively calculate stages for each level of compilation, and if it encounters a stage that is higher than the stage that references it, the validation fails. If we ensure there are no cyclic dependencies, we should be able to execute the entire graph of const expressions in topological order.

Another important piece will be sandboxing the code so we can ensure it is safe to execute. The simplest idea I have for this is to compile the Rust code to WebAssembly, then use wasmtime to host the sandbox for execution. The advantage of this is that WebAssembly/wasmtime is designed to run untrusted code from remote sources in a safe sandbox. And, since it only needs to run as part of the compiler anyways, it might be acceptable to have limited platform support for now. Platforms not supported by wasmtime could use cross-compilation to compile for any supported Rust target. The wasmtime sandbox would only be required at compile time.

This would all have to come after many other const features in the compiler, such as what was mentioned above.

cjhowe-us avatar Mar 03 '25 17:03 cjhowe-us

we already allow compile-time function evaluation, it is done by compiling the code to MIR which is then executed. we know the semantics of the code because we specify and control the semantics of MIR, so we can bar evaluation of constructs that would need to be sandboxed if we so please. there is no need to introduce a webassembly runtime here.

workingjubilee avatar Mar 03 '25 20:03 workingjubilee

The problem with this approach is performance and scalability, but yes we can do this for now.

cjhowe-us avatar Mar 03 '25 22:03 cjhowe-us

Certainly zig's comptime looks cool. In macros, I'd like to focus on the DSL, and have the compile-time computations be done in const.

https://news.ycombinator.com/item?id=43447324

#[derive(Debug)]
const fn List<T>() -> type {
    return struct {
        items: T[],
        len: usize,
    };
}

fn main() {
    let mut buffer: [i32; 10] = [0; 10];
    
    let list = List::<i32> {
        items: &buffer,
        len: 0
    };    
    println!("{:?}", list);
}

black7375 avatar Apr 05 '25 15:04 black7375

Honestly this feature would be leaps and bounds of useful to me for several reasons. One major usage I have for proc macros for example is features such as serialization and similar needs. Achieving these systems in multiple newer languages is beyond easy and I'd adore if rust supported compile time reflection and compile time loops/ifs/etc to achieve such systems for example.

struct Test {
    a: i32,
    b: f32,
    c: String
}

impl Serialize {
    fn serialize<W: Writer>(&self, wtr: &mut W);
}

impl Serialize for Test {
    fn serialize<W: Writer>(&self, wtr: &mut W) {
        const for field in Test.FIELDS {
            field.data.serialize(wtr);
        }
    }
}

this is a very sloppy example but the idea is there you could even do something like this to make it more powerful but this is quite a change

impl<T> Serialize for T where T::FIELDS.all_have_trait(Serialize) {
    fn serialize<W: Writer>(&self, wtr: &mut W) {
        const for field in Test.FIELDS {
            field.data.serialize(wtr);
        }
    }
}

this feature would remove a ton of my reasoning for needing proc macros honestly.

LoopyAshy avatar Apr 05 '25 17:04 LoopyAshy

+1 to this, honestly Zig's comptime is in my opinion the only thing it really has which is clearly better than rust, also it's a common sentiment online that proc macros are terrible, and although I don't agree with that in general, comptime is certainly much better. Would love to see this.

Batres3 avatar Apr 09 '25 12:04 Batres3

+1, if rust just copied exactly what zig has, that would be awesome! Just call it comptime and have all zig's familiar usage <3

michalsustr avatar Jun 11 '25 16:06 michalsustr

+1, also support @ hasField(T, "field")-like, so Trait can check field of implementator at compile time.

amsitlab avatar Jul 28 '25 05:07 amsitlab

fn runtime_in_const(const xs: [i32], ys: [i32]) -> i32 {
 const {
    for (i, x) in xs.enumerate() {
      runtime {
        if x > ys[i] {
          return x;
        }
      }
    }
  }
  return 0;
}

This example is interesting. It could be picked up by currying/specialization

fn runtime_in_const(const xs: [i32], ys: [i32]) -> i32 {
  for (i, x) in xs.enumerate() {
      if x > ys[i] {
        return x;
      }
  }
  return 0;
}

So the compiler knows xs is const, so it does specialization on const parameters and auto-unrolls the loop and auto-substitutes the const. The const-runtime is manual and enforces it though.

But the choose of const vs runtime seems not quite fit. ideally zig's comptime is opposite of runtime, but rust choose to have const and unlikely to have compile.

Consider re-using dyn keyword, other keywords like emit, or just use a macro emit!

fn runtime_in_const(const xs: [i32], ys: [i32]) -> i32 {
 const {
    for (i, x) in xs.enumerate() {
      dyn {
        if x > ys[i] {
          return x;
        }
      emit! {
        if x > ys[i] {
          return x;
        }
      }
    }
  }
  return 0;
}

Consider quote and splice operations from Scala as well. emit = splice(quote())

JakkuSakura avatar Nov 01 '25 16:11 JakkuSakura