RFCs icon indicating copy to clipboard operation
RFCs copied to clipboard

Nim Roadmap 2025 and beyond

Open Araq opened this issue 1 year ago • 20 comments

Life has a melody, Gaius. A rhythm of notes that become your existence once
they're played in harmony with God's plan. Come. See the face of the shape of things to come.
-- "6" from Battlestar Galactica

NOTE: This plan covers 2025 and beyond as it is very ambitious.

Nil/not nil

The plan for this feature changed again and it is now:

  • ref/ptr T: Remains to be unchecked so that it is backwards compatible with every Nim version ever released. But eventually this will be written as unchecked ref/ptr T.
  • ref/ptr T not nil: Checked at compile-time (we need to fix the bugs in the implementation).
  • nil ref/ptr T: Can be nil and it is checked at compile-time that every deref operation is within a guard like if x != nil.

This feature is not scheduled for any particular release.

Version 2.4

- [ ] Enables strict "definite assignment" analysis which enforces at compile-time that variables have been initialized.

  • [ ] Offers an experimental "type-bound operations" mode.

Version 3

Version 3 will be achieved via a combination of compiler phase rewrites, code reuse, refactorings and porting of compiler code. The primary goal is to finally give us a Nim that offers:

  1. Incremental recompilations.
  2. No forward declarations for procs and types required.
  3. Allow for explicit cyclic module dependencies.
  4. Type-checked generics.
  5. Avoid the phase ordering problems that plagued Nim for a long time: Destructors and other =hooks can be invoked before they have been synthesized successfully which is hard for users to understand.

Implementation

The implementation will use NIF for everything: It is the data format used for communication between the different compiler phases.

Internally most (if not all) phases work on streams of NIF tokens, no tree constructions are required. It is expected that this reduces the amount of memory allocations the compiler has to perform by an order of magnitude.

Arguably a token stream enforces a principled approach to compiler development where by design subtrees cannot be forgotten to be traversed and handled. Roughly a NIF token corresponds to a PNode in the old compiler. A NIF token takes 8 bytes in total, a PNode of today's compiler takes 40 bytes. Therefore it is expected that the new compiler takes 5 times less memory than the current compiler. Since precompiled modules are loaded lazily, this factor should be even higher.

The phases of compilation are:

  1. Pure parsing (nifler): Turn Nim code into a dialect of NIF.
  2. Semantic checking phase 1 (nimony): symbol lookups, type checking, template&macro expansions.
  3. Semantic checking phase 2 (nimony): Effect inference.
  4. Iterator inlining (lowerer).
  5. Lambda lifting (lowerer).
  6. Inject derefs (and the corresponding mutation checking) (lowerer).
  7. Inject dups (lowerer).
  8. Lower control flow expressions to control flow statements (elminate the expr/nkStmtListExpr construct) (lowerer).
  9. Inject destructors (lowerer).
  10. Map builtins like new and + to "compiler procs" (lowerer).
  11. Translate exception handling (lowerer).
  12. Generate NIFC code (gear3).

These phases have been collected into different tools with dedicated names.

NIF

NIF is a general purpose text format designed for compiler construction. Think of it as a "JSON for compilers". NIF has a short precise specification and a Nim library implementation.

While NIF is almost a classical Lisp, it innovates in these aspects:

  1. It uses a separate namespace for tags ("builtins") and source code identifiers. It is most extensible and supports a wide range of abstraction levels. Code that is very high level can be represented effectively as well as code that is close to machine code.
  2. Line information is carried around during all phases of compilation for easy debugging and code introspection tools. The line information is based on the difference between a parent and its child node so that the resulting text representation is kept small.
  3. Declaration and use of a symbol are clearly distinguished in the syntax allowing for many different tasks to be implemented with a fraction of the usual complexity: "find definition", "find all uses" and "inline this code snippet" are particularly easy to implement.
  4. There is an additional format called Nif-index that allows for the lazy on-demand loading of symbols. This is most essential for incremental compilations.

Status: Implemented in 2024. No further changes anticipated.

Nifler

The Nifler tool encapsulates the initial Nim-to-NIF translation step and is generally useful for other tools that want to process Nim code without importing the Nim compiler as a library.

Nifler can also evaluate Nim's configuration system, the nim.cfg files and the NimScript files so that tools like nimsuggest get precise --path information and configuration settings without having to import the Nim compiler as a library.

Status: Implemented in 2024. Minor adjustments required.

Nimony

The primary point of Nifler is to shield "Nimony" from Nim's compiler internals. Nimony is a new frontend for Nim, designed from day one for:

  • Low memory consumption.
  • Incremental compilation.
  • Tooling. The compiler continues after an error and supports "find all usages" and "goto definition" which work much more reliably since generics and macros are type-checked too.
  • Efficient handling of code bases that make heavy use of generics, type computations and macros.
  • Reducing the bug count by orders of magnitute. A "Minimal Redundancy Internal Representation" is used, ensuring that there is only one access path to a piece of data. The internally used data structures cannot get out of sync.

Status: In heavy development.

Lowerer

The lowerer's job is to "lower" high level Nim code to low level Nim code that does not use features such as closures, iterators and automatic memory management. It is planned to only support Nim's ARC/ORC scheme of doing memory management. In the old compiler ARC/ORC was very complex to support as the problem was not as well understood as it is now: A key insight here is to split up the tasks into multiple well-defined subtasks:

  1. Inject dups/copies: This can produce weird constructs like while (;let tmp = f(); tmp.value).
  2. Lower control flow expressions to statements. This means if and case do not produce values anymore.
  3. Inject destructors: Now that values have been bound to temporaries explicitly and the control flow has been simplified it is rather easy to inject =destroy(x) calls at scope exists.

As previously mentioned, the lowerer also does:

  • Map builtins to Nim's runtime.
  • Iterator inlining.
  • Eliminate closures by performing "lambda lifting".
  • Inject pointer derefs and implement "pass by reference".
  • Translate exception handling constructs to NIFC's supported error handling.

Status: In heavy development.

Expander

The expander ("Gear 3") performs backend tasks that need to operate on multiple NIF files at once:

  • It copies used imported symbols into the current NIF file. As a fix point operation until no foreign symbols are left.
  • importc'ed symbols are replaced by their .c variants.
  • importc'ed symbols might lead to (incl "file.h") injections.
  • Nim types must be translated to NIFC types.
  • Types and procs must be moved to toplevel statements.

Status: Implemented in 2024. Major adjustments expected.

NIFC: C/C++ Backends based on NIF

NIFC is a dialect of NIF designed to be very close to C. Its benefits are:

  • NIFC is easier to generate than generating C/C++ code directly because:
    1. It uses NIF's regular syntax.
    2. It allows for an arbitrary order of declarations without the need for forward declarations.
  • NIFC improves upon C's quirky array and pointer type confusion by clearly distinguishing between array which is always a value type, ptr which always points to a single element and aptr which points to an array of elements.
  • Inheritance is modelled directly in the type system as opposed to C's quirky type aliasing rule that is concerned with aliasings between a struct and its first element.
  • NIFC can also produce C++ code without information loss because inheritance and exception handling are directly supported.

Status: Implemented in 2024. Bugfixes required.

Araq avatar Dec 04 '24 09:12 Araq

ref/ptr T: Remains to be unchecked so that it is backwards compatible with every Nim version ever released.

Sad to read that. Why not have some kind of "legacy" mode for code that can't be updated?

All ref/ptr T would be exposed to non-legacy code as nil ref/ptr T, so that we can at least be sure that the non-legacy code is correct.

Also, wouldn't it be possible to automatically convert most ref/ptr T to nil ref/ptr T, with the compiler complaining about places without guards? If so, wouldn't that make it relatively easy to upgrade "nil-unsafe" code to "nil-safe" code?

To me, code correctness seems more important than being backwards-compatible with old, unmaintained code. I also believe that a strong focus on correctness would be better for the growth of the Nim ecosystem (see Rust).

I'd love something that is focused on correctness as Rust, but is as expressive as Nim. Obviously, there would need to be compromises, but having Nil in regular code just for backwards-compatibility shouldn't be one.

d4h0 avatar Dec 04 '24 14:12 d4h0

Sad to read that. Why not have some kind of "legacy" mode for code that can't be updated?

Sure but we need to be able to write code that works with all sort of Nim versions and my proposed way accomplishes that. Changing the default via a mode-switch will come too, but it is much harder to pull off in practice, mostly for bootstrapping reasons.

Araq avatar Dec 04 '24 15:12 Araq

Changing the default via a mode-switch will come too, but it is much harder to pull off in practice, mostly for bootstrapping reasons.

And is it the goal to have Nil-safety as the default in the long run?

I think, most of the benefits of such features are only gained if they are the default.

For example, pattern matching can be implemented in libraries in Nim, but because pattern matching isn't used by a significant amount of the ecosystem, it's not remotely as useful as in languages that have it natively.

d4h0 avatar Dec 04 '24 16:12 d4h0

And is it the goal to have Nil-safety as the default in the long run?

Yes, sure.

Araq avatar Dec 04 '24 17:12 Araq

For legacy code couldn't we default to nil ref/ptr T and have any unchecked deref be a compile warning?

Zectbumo avatar Dec 04 '24 18:12 Zectbumo

I think we continue to need unchecked ref/ptr T for good but we can write it with the explicit unchecked in the long run.

Araq avatar Dec 04 '24 18:12 Araq

Offers an experimental "type-bound operations" mode.

Could you expand on that ? I assume this is related to https://github.com/nim-lang/RFCs/issues/380 . Do you have an idea of which form it's going to take

Clonkk avatar Dec 05 '24 09:12 Clonkk

It's already available via --experimental:typeBoundOps.

Araq avatar Dec 06 '24 09:12 Araq

Yes, sure.

That's fantastic.

I observe Nim already for more than 5 years, and really love many things about it. Unfortunately, there are also a few reasons why I don't use it at the moment.

It seems, most of these issues will be addressed in the near future, which is great.

For me the ideal situation would be Nim + more compile-time guarantees + sum types + pattern matching + better tooling (which all seems to be planned) + an ergonomic (ideally, automatic) way to use Rust libraries.

With the new architecture described above, is there any realistic chance that there will be a new backend that targets the Rust compiler?

If not, is there any chance that Nim will natively support crABI once it is implemented/stabilized?

(crABI is basically a proposal for a new ABI for languages with nice type systems).

Making Rust libraries natively usable would solve one of the biggest issues Nim currently has (there are not many libraries).

To me, it seems, Rust is the only option that offers this opportunity. It is compiled, type-safe, has many libraries, has a build tool that basically all Rust projects use (so should be relatively easy to integrate automatically) - I don't think there is anything else that has all of these properties.

I'm sure, I'm not the only one in the Rust community who would absolutely love to use Nim and leverage the Rust ecosystem.

d4h0 avatar Dec 06 '24 11:12 d4h0

I'm sure, I'm not the only one in the Rust community who would absolutely love to use Nim and leverage the Rust ecosystem.

See also https://github.com/arnetheduck/nbindgen/ - this is mostly seamless for our usage of rust using the "old world" C FFI support - crABI would make a few things a little bit more smooth but it is by no means a blocker.

For compiler-level interop, there's https://github.com/arnetheduck/nlvm which lets you do cross-language LTO between nim and rust (and anything else llvm-based) - this is better than crABI in many aspects, ie basically it's using the LLVM IR as ABI interop layer and can deduce many of the things crABI encodes).

arnetheduck avatar Dec 06 '24 11:12 arnetheduck

Inject destructors: Now that values have been bound to temporaries explicitly and the control flow has been simplified it is rather easy to inject =destroy(x) calls at scope exists.

I've been watching progress in the Mojo language and I'm curious about support for ASAP destruction in a future Nim. How much harder is it to destroy at last usage? What are thoughts on the reported benefits of ASAP destruction over scope based RAII?

bpr avatar Dec 06 '24 15:12 bpr

Inject destructors: Now that values have been bound to temporaries explicitly and the control flow has been simplified it is rather easy to inject =destroy(x) calls at scope exists.

I've been watching progress in the Mojo language and I'm curious about spport for ASAP destruction in a future Nim. How much harder is it to destroy at last usage? What are thoughts on the reported benefits of ASAP destruction over scope based RAII?

Hard real-time apps aren't my area of expertise, but wouldn't you want destruction to be predictable and planned in those cases since it occupies a core? This seems like it would share some of the issues of mark-and-sweep in such apps. Maybe destruction could be ASAP unless there's an explicit call to delete(?)

Spirarel avatar Dec 06 '24 16:12 Spirarel

How much harder is it to destroy at last usage?

Not much harder but I have never implemented it so I could be wrong. Also, the resulting edge cases make me a bit nervous...

What are thoughts on the reported benefits of ASAP destruction over scope based RAII?

Seems like a natural extension but the benefit is marginal for Nim which already has fine grained scopes. Things are different for Python/Mojo where the benefit is larger as Python's scoping rules are terrible.

Araq avatar Dec 06 '24 18:12 Araq

With the new architecture described above, is there any realistic chance that there will be a new backend that targets the Rust compiler?

Compiling Nim or NIFC to Rust seems to be very hard and unrealistic. Instead there should be a NIFC to LLVM translation step, much like nlvm does it today.

That said, I'm more interested in interop with Python than with Rust for the time being.

Araq avatar Dec 07 '24 05:12 Araq

Seems like a natural extension but the benefit is marginal for Nim which already has fine grained scopes. Things are different for Python/Mojo where the benefit is larger as Python's scoping rules are terrible.

I was more interested in the reported benefits with respect to tail recursion and various move optimizations.

Yeah Python scoping rules have always been bad. I wish Guido had copied Scheme, or Pascal, which Scheme copied.

bpr avatar Dec 08 '24 22:12 bpr

Well Nim does elide wasMoved+destroy pairs so it's quite comparable and should also enable more tail recursions. But tail recursion should have a dedicated syntax as relying on it as an "optimization" is too subtle, in my personal opinion. In fact, using recursion for everything is a "principle of least power" violation.

Having said that, Mojo's way of doing it might even be simpler to implement than Nim's wasMoved+destroy elision so I'm taking a closer look...

Araq avatar Dec 09 '24 06:12 Araq

Inject dups/copies: This can produce weird constructs like while (;let tmp = f(); tmp.value)

There has been a significant amount of work done to remove copies, use moves instead and also remove useless initialization (zeroMem), it would be helpful to know what requires a copy or maybe have some metrics to track that.

But tail recursion should have a dedicated syntax as relying on it as an "optimization" is too subtle, in my personal opinion.

I agree. And it would allow using them in Nim in debug-mode without running afoul of the 2000 stacktrace recursion limit.

mratsim avatar Dec 09 '24 09:12 mratsim

it would be helpful to know what requires a copy or maybe have some metrics to track that.

Well the compiler follows the spec, https://nim-lang.org/docs/destructors.html#rewrite-rules but there is also ensureMove and a "cursor optimizer" which can elide copies

Araq avatar Dec 09 '24 10:12 Araq

In fact, using recursion for everything is a "principle of least power" violation.

I agree and would welcome a keyword for explicitly introducing tail call optimization. I'll say TCO rather than "recursion " because IMO most tail self recursions can be written better as loops, and the optimization becomes more valuable when we have mutual (tail) recursion.

bpr avatar Dec 16 '24 18:12 bpr

Update: The "Nimony" compiler offers 4 out of 5 of these:

  • [x] Incremental recompilations.
  • [x] No forward declarations for procs and types required.
  • [ ] Allow for explicit cyclic module dependencies.
  • [x] Type-checked generics.
  • [x] Avoid the phase ordering problems that plagued Nim for a long time: Destructors and other =hooks can be invoked before they have been synthesized successfully which is hard for users to understand.

Araq avatar Mar 28 '25 07:03 Araq

Out of curiosity - if someone is just now beginning to learn Nim, are there any notable "user-facing" concepts in the language that they should note will be changing in the new version of Nim? I'm mostly thinking about a practical day-to-day syntax or functionality, not necessarily lower level "under the hood" concepts.

Bullish-Design avatar Jul 03 '25 00:07 Bullish-Design

Yes, prepare for:

  • Explicit nil ref/ptr T annotations, benefit: no more null pointer crashes at runtime.
  • proc mygeneric[T: <ConceptHere>](...) = ...; benefit: increased compile-time checking
  • case inside object gets a new syntax. Benefit: Enables builtin pattern matching.
  • Reworked async / await that is based on continuations. It mitigates the "what color is your function" problem:
    • Instead of async write passive.
    • Instead of await write nothing at all.
    • Async event loop unified with multi-threading: spawn will be part of the event loop.
  • Reworked exception handling annotations. Benefit: Faster code.
  • The effect system will be opt-in. Benefit: No more "GC safety" related compiler errors.
  • ref will default to use atomic operations so the concept of "isolation" will not exist or not nearly be as important.
  • Macros are replaced by compiler plugins which will offer a different API. (We have no design for an API yet though.)
  • Multi-methods are gone for good, use single dispatch methods.

Overall, the compiler will be more picky in some ways, more lenient in others.

Araq avatar Jul 03 '25 05:07 Araq

Multi-methods

I didn't even know these were supported. For people also interested in some history of the feature: https://github.com/nim-lang/RFCs/issues/65

Spirarel avatar Jul 03 '25 17:07 Spirarel

@Araq - Thanks for that list! It'll be a great help in focusing a learning path. One question regarding macros - will learning the current macro system still be transferable to the future compiler plugin system in a "this is what macros can do" kind of way? I understand that the API might change, but will the resulting functionality for the Nim language be similar?

Bullish-Design avatar Jul 03 '25 18:07 Bullish-Design

but will the resulting functionality for the Nim language be similar?

Yes, very! But already the plugin system is even more powerful and allows for some things that are easier than previously. The reason is that plugins can be attached to a type.

Araq avatar Jul 03 '25 22:07 Araq

Reworked async / await that is based on continuations. It mitigates the "what color is your function" problem:

  • Instead of async write passive.
  • Instead of await write nothing at all.

Can't tell you how glad I am that someone in the language design business has finally figured this out. This has been on my wishlist for ages (for all languages with async IO). Great if this also applies to the JS backend.

~~Personally though, I'd expect to use the word async (rather than passive) to get a future instead of awaiting. Default: "sync" = foreground. Opt-out: "async" = background. Recycles a well-understood word, seems intuitive to me.~~

Edit: apologies for misreading this paragraph. It seems passive is meant to be used in function definitions, and it wasn't mentioned at all how to opt-out of awaiting and get a future instead.

I suppose backwards compat is a consideration, but version 3 seems to be full of breaking changes already. :sweat_smile:


On the topic of breaking changes. If this is already supported, then apologies for my ignorance as I'm only starting out with the language: since packages already specify their preferred language version via .nimble files, the language should be able to change some of its behaviors per-package; for the 2025 plan, this may include:

  • Nilness: opt-out instead of opt-in.
  • await: opt-out instead of opt-in.
  • Possibly more.

Internal versioning of some behaviors is an established practice in Go and Rust. Adds compiler maintenance, but allows to move forward with "breaking" changes without actually breaking existing libraries.

I'm sure Nim authors have already considered this. Backwards compatibility has been mentioned above. Personally, as a total beginner / mere dabbler, I feel like some "breaking" cleanup would benefit the language at this stage; deprecating old behaviors without breaking existing libraries would be a very nice bonus.

mitranim avatar Sep 10 '25 20:09 mitranim