carbon-lang
carbon-lang copied to clipboard
Should some values be self-typed? If so, which ones?
There are a number of contexts in which we have suggested that certain values should have unique types. Moreover, there have been suggestions that we can define the type of certain values as being the same entity as value itself (notably, we say the type of the empty tuple ()
is the empty tuple ()
). We could employ this lifting of values to the type level in general for values that should have unique types, forming in each case a type with exactly one value, where that value is the type itself.
The cases to which we might want to apply this include:
- The empty tuple (and tuples whose elements are all on this list, such as
((), (), ())
) - Numeric literals (and perhaps string literals?)
- Functions
Examples:
// Tuples.
var () x = ();
fn f() -> () {
return ();
}
// Numeric literals.
fn F(1: _) {}
fn Run() {
F(1); // OK
F(2); // type error
}
// OK, n is initialized with value 0 of type 0, which is in range.
// There is no possibility of overflow in such expressions.
var Int n = 123456789 * 123456789 - 123456789 * 123456789;
// Functions.
fn IsEven(Int n) -> Bool { return n mod 2 == 0; }
fn PrintIf(List(Int) list, (Type$$ Pred) predicate) {
for (Int n : list) {
if (predicate(n)) {
Print(n);
}
}
}
fn Run() {
List(Int) mylist = {1, 2, 3};
// Specializes PrintIf for IsEven, no runtime dispatch
PrintIf(mylist, IsEven);
}
Note that this has a potential foundational concern: repeatedly asking for the type of a value, and then that type's type, and so on, will not in general converge to Type
, but I'm not sure that's a fundamental problem. (I think it's plausible that we'll want to model Type
as an interface rather than as a type-type.)
So there are two questions here:
- Do we want unique types for numeric literals and functions?
- Do we want self-types for some or all of those cases?
I'll ask the obvious three questions:
- What are the arguments for numerical literals being self-typed? (Which I take to mean that the type of 12345 is 12345.)
- Is there any well known general-purpose statically typed programming language in which numeric literals are self-typed?
- Is there anything in the type theory research literature describing the theoretical properties of a type system in which numeric literals are self-typed?
(Same questions apply to functions, although the answers might not be the same. I'm asking about numbers first since they're simpler.)
I'm especially interested in question 2, just on the general grounds of innovation being scary.
- There is a consistency argument. If we assume we want literals to have unique types, and we want the type of the
()
literal to be()
, then should we apply the same self-type rule to all literals? - Is there any well-known general-purpose statically typed programming language in which types are first-class values? I think that's a prerequisite for what you're asking, and I can't think of one. The most similar thing I can think of is that in Lisp, constants evaluate to themselves, but that's not really all that similar. (In some dynamically typed languages,
type
is self-typed, but that also doesn't match your question along several axes.) - I've not found anything so far. I think the same question applies to
()
andType
, which I think so far we've been assuming are self-typed, but perhaps numeric literals introduce novel problems?
For what it's worth, assuming we want unique types for literals in the first place, I think it would be less surprising to most people if (eg) the type of 2
is some distinct value typeof(2)
(however we spell that). While I'm not yet fully comfortable with saying that a tuple whose values are all types is itself a type, that seems to be less surprising, probably because we're building types out of types, even in the degenerate ()
case.
I've been thinking about this a little bit, and I wonder if something a bit more ad hoc might be appropriate for integer literals. In particular, I would like the property that var i: auto = 17;
would declare i
to have type Int32
as long as the initializer is in range for an Int32
. This approach is inspired by Zig.
- The type of an integer literal is
CInt
for "compile-time integer". -
CInt
values are only allowed at compile-time, it is illegal to declare a variable with typeCInt
without a$
. - Operations on
CInt
values would be resolved at compile time. The result of those operations would either be mathematically correct or give a compile error. No silent overflow. -
CInt
can represent integers in a very large range, at least enough to represent signed and unsigned 256-bit numbers. -
CInt
values could be converted to regular integer types as long as their value was in range. Sovar i: Int8 = 7;
andvar j: UInt32 = 3000000000;
would be allowed. -
var i: auto = __
where__
was aCInt
value would only be legal if__
could be represented as anInt32
, and givei
the typeInt32
.
The name in the literature for self-typed things (i.e. a type inhabited by exactly one value) is "singleton types".
I'd be hesitant to use singleton types for number literals because they could make type error messages a bit surprising.
CInt
values could be converted to regular integer types as long as their value was in range. Sovar i: Int8 = 7;
andvar j: UInt32 = 3000000000;
would be allowed.
One potentially-significant problem with having different behavior for two values of the same type like this is that you can't correctly forward a CInt
value through a wrapper function. This problem is evident and somewhat common in C++, where a 0
literal in a function argument doesn't get forwarded properly:
struct X { X(int*); };
auto p = std::make_unique<X>(0); // error, can't initialize int* from int
C++ solves this problem in this particular case by defining a uniquely-typed nullptr
constant that can be properly forwarded. If we permit the validity and meaning of, say, a function call to depend not only on the type but also on the (constant) value of the argument, we will have the same problem on a wider scale.
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive
label. The long term
label can also be added for issues which are expected to take time.
This issue is labeled inactive
because the last activity was over 90 days ago.
I don't know if this is helpful, but I've always thought that literals should be though of in classical mathematical terms (Peano) as pure syntactical expressions. So 1 is just a shorthand for 'successor(zero)', you can do the same for all basic language constructs (think in terms of functional languages or term-rewriting languages such as Pure and Maude) with an applicative transform to a canonical representation (normalization). Then you have a mapping to the implementation from the canonical form. If you combine this with flow-typing you might get something interesting (i.e. you apply constraints through flow-typing, reducing the set of corresponding concrete implementation types and if there is more than one left in the end, maybe you can use a priority scheme).
Ole-Johan Dahl's 1990s introduction-level book "Verifiable Programming" shows this for basic programming language types as showed in this simple presentation without the verification aspect .
(Anyway, I am somewhat disappointed that "simplified" term-rewriting and flow-typing isn't used more in mainstream metaprogramming/generic programming. Even something limited, such as XQuery/XSLT could be quite potent.)
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive
label. The long term
label can also be added for issues which are expected to take time.
This issue is labeled inactive
because the last activity was over 90 days ago.
Proposal #2360 offers an answer to this question: only Type
is self-typed, but some values may be converted to their type. So the type of ()
is not ()
but () as Type
.
There is also the question: "Do we want unique types for numeric literals and functions?"
#144 answers this question for numeric literals with "yes": the type of int literal N
is IntLiteral(N)
, which has type Type
. We have been applying this approach to other kinds of literals as well, for example RealLiteral(...)
for floating-point types.
There last remaining question is whether functions have a unique type.
There is also the question: "Do we want unique types for numeric literals and functions?" #144 answers this question for numeric literals with "yes": the type of int literal
N
isIntLiteral(N)
, which has typeType
. We have been applying this approach to other kinds of literals as well, for exampleRealLiteral(...)
for floating-point types.
Agreed, in practice I think we've decided "yes" here, but we should confirm that.
There last remaining question is whether functions have a unique type.
I don't think we have an answer here, but I also don't think we need one in this issue. I'd suggest following up with a fresh question there once we have a better idea of how callable stuff should work.
Confirmed with leads, and we're happy with the answers summarized here: https://github.com/carbon-language/carbon-lang/issues/508#issuecomment-1310532783 https://github.com/carbon-language/carbon-lang/issues/508#issuecomment-1310598321 https://github.com/carbon-language/carbon-lang/issues/508#issuecomment-1312222800