Add raw pointers and pointer types.
Basil compiles directly to native code, so it should be relatively straightforward to support unsafe pointer types and pointer arithmetic, at least from a codegen perspective. While the primary means of passing around reference types should be through garbage-collected pointers, supporting raw pointers would allow Basil to more easily express low-level function prototypes - useful for interacting with foreign functions.
I propose we introduce a new primitive type kind, "Pointer", that has a single parameter - the type it points to. We'll tentatively denote the type of a pointer to some type `T` as `T ptr`. Pointer types can be coerced generically to pointer types that point to a generic type like `Any` or a type variable. Besides this, pointer types support no other implicit coercions.
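As a rough analogy only (this is C, not Basil), the one permitted implicit coercion resembles how any C object pointer converts to `void *` without a cast. The helper `is_null` below is hypothetical and `const void *` merely stands in for the proposed `Any ptr`. C also allows the reverse conversion implicitly, which this proposal would not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: takes any object pointer through a generic pointer,
   standing in for a parameter of the proposed type `Any ptr`. */
static int is_null(const void *p) {
    return p == NULL;
}

int demo_generic_coercion(void) {
    int n = 7;
    double d = 1.5;

    /* int * and double * both convert to const void * with no cast. */
    assert(!is_null(&n));
    assert(!is_null(&d));

    /* Counts how many of the three arguments are null. */
    return is_null(&n) + is_null(&d) + is_null(NULL);
}
```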
Pointers should, at minimum, support a few primary operations:
- `deref : T? ptr -> T?`: dereferences a pointer and returns the value it points to.
- `addr : T? -> T? ptr`: computes the address of a value and returns it as a pointer value. The parameter to `addr` must be an lvalue!
- `T? ptr as U? ptr`: converts one pointer type to another. This is unchecked - unsafe pointer coercion is not a runtime or compile error!
- `T? ptr as Int`: converts a pointer to an integer value.
- `Int as T? ptr`: converts an integer to a pointer value. Between this and the previous conversion, rudimentary pointer arithmetic is achievable.
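To make the operations above concrete, here is a minimal sketch of their closest C equivalents; it is not Basil syntax. The names `deref_int`, `next_int`, `demo_round_trip` and `demo_reinterpret` are made up for illustration. The integer round trip relies on a flat address space, which mainstream platforms provide but the C standard does not strictly guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* deref : T ptr -> T, specialised to int for this sketch. */
static int deref_int(const int *p) {
    return *p;
}

/* `T ptr as Int`, arithmetic on the integer, then `Int as T ptr`. */
static int *next_int(int *p) {
    uintptr_t address = (uintptr_t)(void *)p; /* pointer -> integer */
    address += sizeof(int);                   /* step over one int, in bytes */
    return (int *)(void *)address;            /* integer -> pointer */
}

int demo_round_trip(void) {
    int values[2] = {10, 20};
    int *first = &values[0];       /* addr: the operand is an lvalue */
    int *second = next_int(first); /* rudimentary pointer arithmetic */
    assert(second == &values[1]);
    return deref_int(second);
}

int demo_reinterpret(void) {
    /* Every byte is 0x2A (assuming a 32-bit int), so the result does not
       depend on endianness. */
    int word = 0x2A2A2A2A;
    /* `T ptr as U ptr`: an unchecked reinterpretation of the same address. */
    unsigned char *bytes = (unsigned char *)&word;
    return bytes[0];
}
```

Nothing in the cast from `int *` to `unsigned char *` is checked, which mirrors the "not a runtime or compile error" behaviour described for `T? ptr as U? ptr`.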
A few open questions:
- Should we introduce new syntax for dereference and address-of operations? We could replicate C-style `&val` and `*ptr` syntax with a few new tokens. One less-invasive alternative would be Zig-style dot syntax: `val.&` and `ptr.*` would be easily expressed in the current Basil semantics.
- Perhaps we could add some easier pointer arithmetic instructions than converting to and from `Int`? Maybe it could be type-based: `ptr + Int` could add the size of an `Int` to the address contained in `ptr`.
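The type-based idea in the second question can be sketched in C, again only as an analogy. The hypothetical helper `advance_by_size` models `ptr + Int` by advancing a pointer by the size of one value of the named type, and the demo checks that this agrees with C's built-in element-scaled `p + 1`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper modelling `ptr + T`: advance the address by the size
   of one T, passed here as a byte count. */
static void *advance_by_size(void *p, size_t type_size) {
    return (unsigned char *)p + type_size;
}

int demo_typed_advance(void) {
    int values[3] = {1, 2, 3};

    /* "ptr + Int": move forward by sizeof(int) bytes. */
    int *via_size = (int *)advance_by_size(values, sizeof(int));

    /* C's own pointer arithmetic already scales by the pointee size. */
    int *via_c = values + 1;

    assert(via_size == via_c);
    return *via_size;
}
```

One design difference worth noting: C derives the step from the pointer's own pointee type, whereas the `ptr + Int` form names the type on the right-hand side, so the step need not match what the pointer points to.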
Generally, references (i.e. disguised pointers with more restricted semantics) are a necessity for a general-purpose language. I think pointers (i.e. the unrestricted semantics, such as pointer arithmetic) should definitely be disallowed by default, as Rust, V, and other modern languages do, and only allowed in some sort of `unsafe { }` block or other visually explicit denotation.
> A few open questions:
> - Should we introduce new syntax for dereference and address-of operations? We could replicate C-style `&val` and `*ptr` syntax with a few new tokens. One less-invasive alternative would be Zig-style dot syntax: `val.&` and `ptr.*` would be easily expressed in the current Basil semantics.
By default (say, outside of `unsafe { }` blocks) a reference (pointer) shall be fully indistinguishable from a non-reference (non-pointer) value. Many newer as well as older/traditional languages have shown that it's unnecessary to make this explicit, because safe built-in statements and operations behave on the surface the same as with non-ref values. Making it explicit (that we want to deal with a ref) in just one place (e.g. in the function argument definition) seems more than enough.
> - Perhaps we could add some easier pointer arithmetic instructions than converting to and from `Int`? Maybe it could be type-based: `ptr + Int` could add the size of an `Int` to the address contained in `ptr`.
Yep, why not. But only in the `unsafe { }` block. Otherwise compile-time error :wink:.
I'm kind of morally opposed to `unsafe` as a language feature - if we add a perfectly functional feature that is often the best solution to a problem, why actively discourage its use? I don't think it fits Basil's theme of flexibility to strike down useful features as "undesirable"...and deal with that in no way other than to make that feature more annoying. It's a little more justifiable in Rust due to their static analysis, but Basil is garbage collected! So we don't need to limit ourselves in order to get memory-safe allocations.
If it wasn't clear though, the predominant approach towards reference semantics and memory management will be through safe, garbage-collected reference types - I've created a new issue for those, and the intent is that they'll be the recommended kind of pointer type for most workloads.
Ah, ok. This reminds me of Nim's references.
Now it's clear that it should be easy to tell from the source code whether we're dealing with "dangerous pointers" or "safe pointers". That could be enough for me, as a linter or some compiler option (akin to `-Werror` `-Wall` `-Wextra` or perhaps `-Wpointer`) could be made to fail compilation of sources with raw pointers :wink:.