carbon-lang Should some values be self-typed? If so, which ones?

There are a number of contexts in which we have suggested that certain values should have unique types. Moreover, there have been suggestions that we can define the type of certain values as being the same entity as value itself (notably, we say the type of the empty tuple () is the empty tuple ()). We could employ this lifting of values to the type level in general for values that should have unique types, forming in each case a type with exactly one value, where that value is the type itself.

The cases to which we might want to apply this include:

The empty tuple (and tuples whose elements are all on this list, such as ((), (), ()))
Numeric literals (and perhaps string literals?)
Functions

Examples:

// Tuples.
var () x = ();
fn f() -> () {
  return ();
}

// Numeric literals.
fn F(1: _) {}
fn Run() {
  F(1); // OK
  F(2); // type error
}
// OK, n is initialized with value 0 of type 0, which is in range.
// There is no possibility of overflow in such expressions.
var Int n = 123456789 * 123456789 - 123456789 * 123456789;

// Functions.
fn IsEven(Int n) -> Bool { return n mod 2 == 0; }
fn PrintIf(List(Int) list, (Type$$ Pred) predicate) {
  for (Int n : list) {
    if (predicate(n)) {
      Print(n);
    }
  }
}
fn Run() {
  List(Int) mylist = {1, 2, 3};
  // Specializes PrintIf for IsEven, no runtime dispatch
  PrintIf(mylist, IsEven);
}

Note that this has a potential foundational concern: repeatedly asking for the type of a value, and then that type's type, and so on, will not in general converge to Type, but I'm not sure that's a fundamental problem. (I think it's plausible that we'll want to model Type as an interface rather than as a type-type.)

So there are two questions here:

Do we want unique types for numeric literals and functions?
Do we want self-types for some or all of those cases?

Apr 30 '21 01:04 zygoloid

I'll ask the obvious three questions:

What are the arguments for numerical literals being self-typed? (Which I take to mean that the type of 12345 is 12345.)
Is there any well known general-purpose statically typed programming language in which numeric literals are self-typed?
Is there anything in the type theory research literature describing the theoretical properties of a type system in which numeric literals are self-typed?

(Same questions apply to functions, although the answers might not be the same. I'm asking about numbers first since they're simpler.)

I'm especially interested in question 2, just on the general grounds of innovation being scary.

Apr 30 '21 18:04 austern

There is a consistency argument. If we assume we want literals to have unique types, and we want the type of the () literal to be (), then should we apply the same self-type rule to all literals?
Is there any well-known general-purpose statically typed programming language in which types are first-class values? I think that's a prerequisite for what you're asking, and I can't think of one. The most similar thing I can think of is that in Lisp, constants evaluate to themselves, but that's not really all that similar. (In some dynamically typed languages, type is self-typed, but that also doesn't match your question along several axes.)
I've not found anything so far. I think the same question applies to () and Type, which I think so far we've been assuming are self-typed, but perhaps numeric literals introduce novel problems?

For what it's worth, assuming we want unique types for literals in the first place, I think it would be less surprising to most people if (eg) the type of 2 is some distinct value typeof(2) (however we spell that). While I'm not yet fully comfortable with saying that a tuple whose values are all types is itself a type, that seems to be less surprising, probably because we're building types out of types, even in the degenerate () case.

May 04 '21 19:05 zygoloid

I've been thinking about this a little bit, and I wonder if something a bit more ad hoc might be appropriate for integer literals. In particular, I would like the property that var i: auto = 17; would declare i to have type Int32 as long as the initializer is in range for an Int32. This approach is inspired by Zig.

The type of an integer literal is CInt for "compile-time integer".
CInt values are only allowed at compile-time, it is illegal to declare a variable with type CInt without a $.
Operations on CInt values would be resolved at compile time. The result of those operations would either be mathematically correct or give a compile error. No silent overflow.
CInt can represent integers in a very large range, at least enough to represent signed and unsigned 256-bit numbers.
CInt values could be converted to regular integer types as long as their value was in range. So var i: Int8 = 7; and var j: UInt32 = 3000000000; would be allowed.
var i: auto = __ where __ was a CInt value would only be legal if __ could be represented as an Int32, and give i the type Int32.

May 21 '21 16:05 josh11b

The name in the literature for self-typed things (i.e. a type inhabited by exactly one value) is "singleton types".

I'd be hesitant to use singleton types for number literals because they could make type error messages a bit surprising.

May 26 '21 17:05 jsiek

CInt values could be converted to regular integer types as long as their value was in range. So var i: Int8 = 7; and var j: UInt32 = 3000000000; would be allowed.

One potentially-significant problem with having different behavior for two values of the same type like this is that you can't correctly forward a CInt value through a wrapper function. This problem is evident and somewhat common in C++, where a 0 literal in a function argument doesn't get forwarded properly:

struct X { X(int*); };
auto p = std::make_unique<X>(0); // error, can't initialize int* from int

C++ solves this problem in this particular case by defining a uniquely-typed nullptr constant that can be properly forwarded. If we permit the validity and meaning of, say, a function call to depend not only on the type but also on the (constant) value of the argument, we will have the same problem on a wider scale.

May 26 '21 20:05 zygoloid

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time. This issue is labeled inactive because the last activity was over 90 days ago.

Aug 25 '21 01:08 github-actions[bot]

I don't know if this is helpful, but I've always thought that literals should be though of in classical mathematical terms (Peano) as pure syntactical expressions. So 1 is just a shorthand for 'successor(zero)', you can do the same for all basic language constructs (think in terms of functional languages or term-rewriting languages such as Pure and Maude) with an applicative transform to a canonical representation (normalization). Then you have a mapping to the implementation from the canonical form. If you combine this with flow-typing you might get something interesting (i.e. you apply constraints through flow-typing, reducing the set of corresponding concrete implementation types and if there is more than one left in the end, maybe you can use a priority scheme).

Ole-Johan Dahl's 1990s introduction-level book "Verifiable Programming" shows this for basic programming language types as showed in this simple presentation without the verification aspect .

(Anyway, I am somewhat disappointed that "simplified" term-rewriting and flow-typing isn't used more in mainstream metaprogramming/generic programming. Even something limited, such as XQuery/XSLT could be quite potent.)

Jul 25 '22 07:07 OlaFosheimGrostad

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time. This issue is labeled inactive because the last activity was over 90 days ago.

Nov 10 '22 02:11 github-actions[bot]

Proposal #2360 offers an answer to this question: only Type is self-typed, but some values may be converted to their type. So the type of () is not () but () as Type.

Nov 10 '22 16:11 josh11b

There is also the question: "Do we want unique types for numeric literals and functions?" #144 answers this question for numeric literals with "yes": the type of int literal N is IntLiteral(N), which has type Type. We have been applying this approach to other kinds of literals as well, for example RealLiteral(...) for floating-point types.

There last remaining question is whether functions have a unique type.

Nov 10 '22 16:11 josh11b

There is also the question: "Do we want unique types for numeric literals and functions?" #144 answers this question for numeric literals with "yes": the type of int literal N is IntLiteral(N), which has type Type. We have been applying this approach to other kinds of literals as well, for example RealLiteral(...) for floating-point types.

Agreed, in practice I think we've decided "yes" here, but we should confirm that.

There last remaining question is whether functions have a unique type.

I don't think we have an answer here, but I also don't think we need one in this issue. I'd suggest following up with a fresh question there once we have a better idea of how callable stuff should work.

Nov 11 '22 22:11 chandlerc

Confirmed with leads, and we're happy with the answers summarized here: https://github.com/carbon-language/carbon-lang/issues/508#issuecomment-1310532783 https://github.com/carbon-language/carbon-lang/issues/508#issuecomment-1310598321 https://github.com/carbon-language/carbon-lang/issues/508#issuecomment-1312222800

Nov 11 '22 23:11 chandlerc

carbon-lang carbon-lang copied to clipboard

Should some values be self-typed? If so, which ones?

carbon-lang
carbon-lang copied to clipboard