Stakes in the ground
It seems I write these issues these days instead of blog posts. Very well.
This issue is a self-closing one about the new spec I'm writing. Oh right, I'm writing a new spec, check it out! (At the time of writing, I'm up to about chapter 6 or so.)
Anyway, I realized that there were some "soft" topics I wanted to write about, that are more about design sensibilities and "taste" — stakes in the ground — than about the objective things that go in the spec itself. Subjective stuff, basically. But somehow rooted in reasoning, or at least I'd like to think so.
I'll stub out the subsequent sections as individual comments, and then fill them in. After that, I'll close this issue.
Variable scope
Alma uses the `my` keyword to declare variables. Besides being the variable-declaring keyword of choice for both Raku and Perl, I also quietly dig its suitably down-to-earth directness. Whose variable? My variable! G'doy!
Seriously, though, I could've gone with `let`. Definitely my second favorite.
Languages like C, C++, Java, and C# use the type (like `int`) in lieu of a dedicated variable-declaring keyword. That's fine, I guess, but it ties your language into being one of those "static" straitjacket languages, which Perl, Raku, and Alma are not. Alma chooses not Raku's "type second" approach (`my Int $n`), but instead TypeScript's "type after colon" approach (`my n: Int`).
Python and Ruby do not deign to have either keyword or type. They just assign to the new variable, and this assignment also causes it to be declared. I was going to say something dramatic, like "that is Wrong". But Python and Ruby are rather popular, and so clearly a language can reach a respectable level of popularity without explicitly declared variables. All I can really say with certainty is that either I'm wrong, or the unwashed masses are.
Ancient-enough versions of C only allowed you to declare local variables at the top of functions. This was, as I understand it, a way of torturing the programmer for the benefit of the compiler writer: a function gets a stack frame of a certain size, a size that grows with each new variable declaration. With all the declarations at the top, the compiler doesn't have to recalculate that size halfway through the function.
Which brings us to the travesty that is JavaScript's var declaration.
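To make the complaint concrete before I write this section out, here's a sketch of standard JavaScript behavior (not Alma code): `var` is function-scoped rather than block-scoped, and its declaration is hoisted to the top of the function while its initialization is not.

```javascript
function varScopeDemo() {
    // `x` is declared below, inside a block, yet its declaration is
    // hoisted to the top of the *function*; only the assignment stays put.
    const before = x;       // undefined, not a ReferenceError

    if (true) {
        var x = 42;         // `var` ignores the block...
    }

    return [before, x];     // ...so `x` is still in scope here
}

varScopeDemo(); // returns [undefined, 42]
```

Both behaviors — reading a variable before its initializer runs, and a "block-local" variable leaking out of its block — are exactly the kind of thing `let` was later added to fix.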
tbd
Hoisting
The classical "folk explanation" of hoisting is pretty dumb. Not because it's super-incorrect — it actually tends to be helpful — but because it's one of those explanations that makes a big deal out of a simple thing, and then fails to explain stuff when it gets complicated.
Example of when it succeeds: you can call a function that is "not declared yet".
f(); // this call works
function f() {
    console.log("whoa!");
}
The reason this works, says the dumb explanation, is that the function f declaration gets "hoisted" (by its own petard, one presumes) and ends up physically at the top of its surrounding scope, so that the code effectively looks like this:
function f() {
    console.log("whoa!");
}
f();
Even though that's not what you wrote.
Example of when it fails: two functions can call each other.
function isEven(n) {
    return n === 0 || isOdd(n - 1);
}
function isOdd(n) {
    return n !== 0 && isEven(n - 1);
}
There's no way to use the "hoisting" imagery in this case to show how the code gets rewritten: no rewriting can physically place each of the two functions above the other.
tbd
Initialization
tbd
Thunking
tbd
Type conversions
tbd
Object model
tbd
xxx Every language has an object model that looks slightly different; C++, Perl 5/Moose, Python, JavaScript, C# — frustrating! Is there a "truth" about objects? A way to model them without falling prey to xkcd's "there are 15 competing standards"?
xxx Narrowly deciding to use `has` as a field declarator keyword instead of `my`; mainly because a different set of annotations are used on fields. (Also because `has` ends up having a different scope than `my`; this, I believe, was a major reason in Raku as well.)
xxx Also, yes, having a declarator keyword at all feels better than just letting the identifier be the declarator (as in JS/TS and Python); same argument leads to `case` as a declarator for enums
Generic functions
Functions are pretty powerful before we even get to generic functions. Here are the modular (but on-by-default) features you get in ordinary functions:
- Optional parameters. A function call can succeed without the corresponding argument being passed. Usually a parameter is required; if it's declared as optional and its corresponding argument is not passed, the parameter will have the value `none`.
- Default expressions. Instead of binding to `none`, an optional parameter can be equipped with a default expression. If the corresponding argument is not passed, the default expression is evaluated, and the parameter is bound to the result.
- Rest parameter. In the case of excess arguments being passed, these can be collected in a dedicated rest parameter. This parameter always binds to an array, but the array may or may not be empty, and it doesn't have an element type by default.
- Named parameters. Besides the usual ("positional") parameters, it's also possible to declare parameters named, meaning that they are passed as named arguments when the function is called. Named arguments do not have to be passed in the same order as the corresponding named parameters were declared. Named parameters can orthogonally be declared optional, be given default expressions, and be collected by a named rest parameter (binding to a `Dict`). They also interact orthogonally with the features below.
- Call-by-name arguments. Discussed more in the comment below. The corresponding operand (which is an expression) is passed unevaluated. Accessing the parameter causes the expression to be evaluated (in the environment of the caller). The resulting value is not memoized; each new variable access causes a new, separate evaluation.
- Lazy thunk arguments. The operand is passed unevaluated. The resulting value is memoized; once a result has been computed, it's saved and used as the result of subsequent variable accesses.
- In/out/inout modifiers. Marking a parameter as `@in` (the default) means that the parameter passing is about the rvalue, and that assignments to the parameter are disallowed (and will fail at runtime if attempted). Marking a parameter as `@out` means that the parameter passing is about the lvalue; assignments to the parameter are allowed and "write through" to the location referenced in the caller; variable accesses are not allowed. Marking a parameter as `@inout` means that it's about both the rvalue and the lvalue; assignments write through to the location, and variable accesses give the rvalue. In the case of `@out` and `@inout`, passing an argument which is not an lvalue results in an error.
- Parameter types. xxx
tbd
- Advice. xxx
- Multifunctions/generic functions. xxx
- Multimethods. xxx
To incorporate: this musing, which calls out the need for `@cbn` parameters in a language — and maybe a step beyond that is to have something like macros or operatives. HN discussion.
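In a language without `@cbn` parameters, both argument-passing flavors from the list above can be simulated with zero-argument closures. A JavaScript sketch (names made up for illustration), using a counter to make the difference in evaluation behavior observable:

```javascript
let evaluations = 0;
const expensive = () => { evaluations += 1; return 21 * 2; };

// Call-by-name: the parameter is a thunk; every access re-evaluates
// the original argument expression.
function byName(thunk) {
    return thunk() + thunk();   // evaluates the argument twice
}

// Lazy thunk: the first access evaluates and caches; later accesses
// reuse the cached result.
function memoize(thunk) {
    let computed = false;
    let value;
    return () => {
        if (!computed) {
            value = thunk();
            computed = true;
        }
        return value;
    };
}
function lazy(thunk) {
    const t = memoize(thunk);
    return t() + t();           // evaluates the argument only once
}

byName(expensive); // returns 84; `evaluations` is now 2
lazy(expensive);   // returns 84; `evaluations` is now 3 (only one more)
```

The two simulations return the same value; only the number of times the argument expression runs differs, which is exactly the non-memoized/memoized distinction between the two bullet points.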
Call-by-value vs call-by-name
In a pure, effect-free world, there's no observable difference between CBV and CBN. This is something that Paul Blain Levy points out in his dissertation: CBV and CBN arise as two different evaluation strategies.
But I recently found a paper where the CBV/CBN distinction is explained quite vividly: "Grokking the Sequent Calculus".
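A minimal way to see the distinction once effects enter the picture, sketched in JavaScript with a thunk standing in for CBN (names are made up):

```javascript
// An argument expression with an effect: evaluating it throws.
const boom = () => { throw new Error("evaluated!"); };

// Call-by-value: arguments are evaluated before the body runs, so even a
// function that ignores its parameter forces the effect.
function constantlyFiveCBV(x) { return 5; }
// constantlyFiveCBV(boom()) throws before the body is ever entered.

// Call-by-name (simulated with a thunk): the parameter is evaluated only
// if accessed, so the unused argument causes no effect at all.
function constantlyFiveCBN(thunk) { return 5; }
constantlyFiveCBN(boom); // returns 5; `boom` never runs
```

Remove the `throw` (or any other effect, including nontermination) and the two strategies become indistinguishable again, which is the effect-free observation above.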
xxx
Macro expansion
tbd
The billion dollar mistake
tbd (well, I mean — not planning to (re-)do the mistake itself; planning to write this out later)
Lexing and parsing
tbd
xxx lexer-parser separation (vs scannerless), especially in the face of both lexer and parser being extensible
xxx Alma's general trend towards a "pure" parse (away from Perl/Raku (!) and towards JavaScript, Dylan, Scheme) in which side effects are not fired when parsing e.g. a class
xxx: in particular, mention the problematic parsing of `<` when it's overloaded both as the less-than comparison operator and as the start of a generic type argument list
xxx the need for "abstract parsing" in quasis, and how that ties into LR parser generation
Types (and static vs dynamic)
xxx Alma being at heart a dynamic language, but (more so than Raku) friendly to the static point of view
xxx gradual typing (?)
xxx pluggable types; never requiring types in a valid program (as do, for example, type classes)
xxx one expression language vs two (one for terms and one for types); see https://matklad.github.io/2025/08/09/zigs-lovely-syntax.html#Everything-Is-an-Expression
xxx https://gbracha.blogspot.com/2018/10/reified-generics-search-for-cure.html shows how to think about generic types from an "optional type system" perspective
Exceptions, next/last/redo, and effect handlers
xxx
Foreign Function Interface (FFI)
xxx https://verdagon.dev/blog/fearless-ffi
Whether everything is (or should be) an expression
xxx (no)
xxx https://craftinginterpreters.com/the-lox-language.html#design-note
Enums
xxx https://graydon2.dreamwidth.org/253769.html is a good post about sum types/discriminated unions -- takeaway, in my opinion, is that you win from pushing the idea of "can't access the wrong variant" into both the type system and the runtime
xxx TypeScript is the closest to what I imagine I'd want for Alma -- specifically the so-called "flow typing", which allows conditional branching to refine the types of variables in blocks
xxx more exactly, I'd like for Alma to be both (untyped) JavaScript, in that you can go ahead and write any code you like and try to run it, and TypeScript, in that you can start adding type annotations and get (IDE/tooling) errors and completion
xxx there's a representation decision I simply have not made yet: whether enums should be based on (symbol-like) tags, or whether they should be more like subclasses in a closed hierarchy
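For what the tag-based option might look like, here's a JavaScript sketch (all names hypothetical) where the "can't access the wrong variant" guarantee from the post above is pushed into the runtime:

```javascript
// Hypothetical tag-based representation of an enum with two variants,
// Circle(r) and Rect(w, h).
const Circle = (r) => ({ tag: "Circle", r });
const Rect = (w, h) => ({ tag: "Rect", w, h });

// `match` is the only sanctioned way to look inside a variant: each
// handler only ever sees a value of its own variant, and an unhandled
// tag fails loudly at runtime instead of silently yielding undefined.
function match(value, handlers) {
    const handler = handlers[value.tag];
    if (handler === undefined) {
        throw new Error(`unhandled variant: ${value.tag}`);
    }
    return handler(value);
}

const area = (shape) =>
    match(shape, {
        Circle: ({ r }) => Math.PI * r * r,
        Rect: ({ w, h }) => w * h,
    });

area(Rect(3, 4)); // 12
```

The subclasses-in-a-closed-hierarchy option would instead get the same guarantee from method dispatch; this sketch only illustrates the tag-based side of the undecided representation question.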