cpp2
Overview of Herb Sutter's Cpp2 & cppfront
Cpp2 Language Overview
Disclaimer:
- These docs are unofficial and may be inaccurate or incomplete.
- Please file bugs at https://github.com/ntrel/cpp2/issues.
- At the time of writing, Cpp2 is an unstable experimental language, see:
- https://github.com/hsutter/cppfront#cppfront
- https://github.com/hsutter/cppfront#wheres-the-documentation.
Note: Some examples are snipped/adapted from: https://github.com/hsutter/cppfront/tree/main/regression-tests
Note: Examples here use C++23 std::println instead of std::cout.
If you don't have it, you can use this definition:
std: namespace = {
println: (args...) = (std::cout << ... << args) << "\n";
}
Contents
- Declarations
- Variables
- Modules
- Types
- Memory Safety
- Expressions
- Statements
- Functions
- User-Defined Types
- Templates
- Aliases
Declarations
These are of the form:
- declaration:
  - identifier : type? = initializer
type can be omitted for type inference (though not at global scope).
x: int = 42;
y := x;
A global declaration can be used before the line declaring it.
Mixing Cpp1 Declarations
Cpp1 declarations can be mixed in the same file.
// Cpp2
x: int = 42;
// Cpp1
int main() {
return x; // use a Cpp2 definition
}
A Cpp2 declaration cannot use Cpp1 declaration format internally:
// declare a function
f: () = {
int x; // error
}
Note: cppfront has a -p switch to only allow pure Cpp2.
Variables
Uninitialized Variables
Use of an uninitialized variable is statically detected.
When the variable declaration specifies the type, initialization can be
deferred to a later statement.
Both branches of an if statement must
initialize a variable, or neither.
x: int;
y := x; // error, x is uninitialized
if f() {
x = 1; // initialization, not assignment
} else {
x = 0; // initialization required here too, otherwise an error
}
x = 2; // assignment
Runtime Constants
x: const int;
x = 5; // initialization
x = 6; // error
y: int = 7;
z: const _ = y; // z is a `const int`
Note that x does not need to be initialized immediately; initialization can be deferred.
This is particularly useful when using if branches to initialize the
constant.
https://github.com/ntrel/cppfront/wiki/Design-note:-const-objects-by-default
Implicit Move on Last Use
A variable is implicitly moved on its last use when the use site syntax may accept an rvalue. This includes passing an argument to a function, but not an assignment to the last use of a variable.
inc: (inout v: int) = v++;
test2: () = {
v := 42;
inc(v); // OK, lvalue
inc(v); // error, cannot pass rvalue
}
This can be suppressed by adding a statement _ = v; after the final inc call.
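The error above follows from an ordinary Cpp1 rule: an rvalue cannot bind to a non-const lvalue reference. The C++ sketch below illustrates that rule only. It assumes the last use is lowered to something like std::move(v), which this overview does not confirm, and inc_twice is a hypothetical helper.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Cpp1 counterpart of `inc: (inout v: int) = v++;`
inline void inc(int& v) { ++v; }

// An lvalue argument binds to the `int&` parameter.
static_assert(std::is_invocable_v<decltype(&inc), int&>);
// An rvalue (assumed result of an implicit move on last use) does not.
static_assert(!std::is_invocable_v<decltype(&inc), int&&>);

// Hypothetical helper: v stays an lvalue for both calls, so both compile.
inline int inc_twice(int v) {
    inc(v);
    inc(v);
    return v;
}
```

In Cpp1 nothing is moved implicitly here, so both calls are accepted; the Cpp2 error arises only because the final use is treated as a move.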
Modules
Cpp2 files have the file extensions .cpp2 and .h2.
Imports
C++23 will support:
import std;
This will be implicitly done in Cpp2. For now common std headers are imported.
Types
See also: User-Defined Types.
Arrays
Use:
- std::array for fixed-size arrays.
- std::vector for dynamic arrays.
- std::span to reference consecutive elements from either.
Pointers
A pointer to T has type *T. Pointer arithmetic is illegal.
Postfix Pointer Operators
Address of and dereference operators are postfix:
x: int = 42;
p: *int = x&;
y := p*;
This makes p-> obsolete - use p*. instead.
To distinguish these from binary & and *, use preceding whitespace.
new<T>
new<T> gives unique_ptr by default:
p: std::unique_ptr<int> = new<int>;
q: std::shared_ptr<int> = shared.new<int>;
Note: gc.new<T> will allocate from a garbage collected arena.
There is no delete operator. Raw pointers cannot own memory.
Null Dereferences
Initialization or assignment from null is an error:
q: *int = nullptr; // error
Instead of using null for *T, use std::optional<*T>.
By default, cppfront also detects a runtime null dereference.
For example when dereferencing a pointer created in Cpp1 code.
int *ptr;
f: () -> int = ptr*;
Calling f above produces:
Null safety violation: dynamic null dereference attempt detected
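The idea of a runtime null check can be sketched in plain C++. This is not cppfront's implementation: checked_deref and deref_rejected are hypothetical helpers, and throwing an exception (rather than reporting a violation the way cppfront does) is an assumption made so the behavior can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: dereference a raw pointer after a runtime null check.
template <typename T>
T& checked_deref(T* p) {
    if (p == nullptr) {
        throw std::runtime_error("null dereference attempt detected");
    }
    return *p;
}

// Hypothetical helper: reports whether dereferencing p was rejected.
inline bool deref_rejected(int* p) {
    try {
        (void)checked_deref(p);
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```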
Memory Safety
Cpp2 will not enforce a memory-safety subset 100%. It will diagnose or prevent type, bounds, initialization, and common lifetime memory-safety violations. This is done by:
- Runtime bounds checks
- Requiring each variable is initialized before use in every possible branch
- Not implemented yet: Compile-time tracking of a set of 'points-to' information for each pointer. When a pointed-to variable goes out of scope, the set is updated to replace the variable with an invalid item. Dereferencing a pointer with a set containing an invalid item is a compile-time error. See https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1179r1.pdf.
See:
- https://github.com/hsutter/cppfront#2015-lifetime-safety
- https://www.reddit.com/r/cpp/comments/16ummo8/cppfront_autumn_update/k2r3fto/
Bounds Checks
By default, cppfront does runtime bound checks when indexing:
v: std::vector = (1, 2);
i := v[-1]; // aborts program
s: std::string = ("hi");
i = s[2]; // aborts program
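For comparison, the closest standard Cpp1 facility is the checked accessor at(). The sketch below assumes nothing about how cppfront implements its checks; out_of_bounds is a hypothetical helper, and at() throws std::out_of_range instead of aborting.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: uses the checked accessor `at` and reports
// whether index i was rejected.
template <typename Container>
bool out_of_bounds(const Container& c, std::size_t i) {
    try {
        (void)c.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Index 2 of the two-character string "hi" is rejected by at(), matching the s[2] case above.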
Expressions
Postfix Operators
Besides the pointer operators, Cpp2 also only uses postfix instead of prefix form for:
- ++
- --
- ~
Unlike Cpp1, the immediate result of postfix increment/decrement is the new value.
i := 0;
assert(i++ == 1);
https://github.com/hsutter/cppfront/wiki/Design-note:-Postfix-operators
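Going by the rule above, Cpp2 i++ behaves like Cpp1 prefix ++i. A small C++ sketch of the difference, with hypothetical function names:

```cpp
#include <cassert>

// Cpp2 `i++` yields the new value, which is what Cpp1 prefix `++i` does.
inline int cpp2_style_increment(int& i) { return ++i; }

// Cpp1 postfix `i++` yields the old value instead.
inline int cpp1_postfix_increment(int& i) { return i++; }
```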
String Interpolation
A bracketed expression with a trailing $ inside a string will
evaluate the expression, convert it to string and insert it into the
string.
a := 2;
b: std::optional<int> = 2;
s: std::string = "a^2 + b = (a * a + b.value())$\n";
assert(s == "a^2 + b = 6\n");
Note: $ means 'capture' and is also used in closures
and postconditions:
https://github.com/hsutter/cppfront/wiki/Design-note%3A-Capture
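The interpolation example can be written by hand in Cpp1 roughly as follows. This is only an equivalent of its observable result, not the code cppfront generates; interpolate is a hypothetical name.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Builds the same string as "a^2 + b = (a * a + b.value())$\n".
inline std::string interpolate(int a, std::optional<int> b) {
    return "a^2 + b = " + std::to_string(a * a + b.value()) + "\n";
}
```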
Anonymous Variables
- anonymousVariable:
  - : type? = expression
f: (i: int) = { std::println("int"); }
f: (i: short) = { std::println("short"); }
main: () = {
f(5); // int
f(:short = 5); // short
}
The last statement is equivalent to tmp: short = 5; f(tmp);.
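The overload selection in this example works the same way in Cpp1. In the sketch below the overloads return a string instead of printing, and the two call_with_ functions are hypothetical names.

```cpp
#include <cassert>
#include <string>

inline std::string f(int) { return "int"; }
inline std::string f(short) { return "short"; }

// `f(5)`: the literal 5 is an int, so the int overload is chosen.
inline std::string call_with_literal() { return f(5); }

// `f(:short = 5)` spelled out as `tmp: short = 5; f(tmp);`
inline std::string call_with_short() {
    short tmp = 5;
    return f(tmp);
}
```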
Identifier Expressions
- identifierExpression:
  - identifier
  - identifier <expressions>
  - expression :: identifierExpression
Required Parentheses
Whenever any kind of identifier expression is used where it could parse as a type, it must be enclosed in parentheses:
- id1 - type
- (id1) - expression
An identifier expression does not need parentheses where a type would not be valid. Other expressions never need parentheses as they could not be parsed as a valid type, e.g. literals, unary expressions etc.
as
- asExpression:
  - expression as type
x as T attempts:
- type conversion (if the type of x implicitly converts to T)
- customized conversion (using operator as<T>), useful for std::optional, std::variant etc.
- construction of T(x)
- dynamic casting (equivalent to Cpp1 dynamic_cast<T>(x) when x is a base class of T)
An exception is thrown if the expression is well-formed but the conversion is invalid.
c := 'A';
i: int = c as int;
assert(i == 65);
v := std::any(5);
i = v as int;
s := "hi" as std::string;
assert(s.length() == 2);
is
- isExpression:
  - type is (type | template)
  - expression is (type | expression | template)
Type Tests
Not implemented yet.
Test a type T matches another type - T is Target attempts:
- true when T is the same type as Target.
- true if T is a type that inherits from Target.
Test a type against a template - T is Template attempts:
- true if T is an instance of Template.
- Template<T> if the result is convertible to bool.
Expression Tests
Note: Testing an identifier expression needs to use parentheses.
Test type of an expression - (x) is T attempts:
- true when the type of x is T
- x.operator is<T>()

(x) is void means x is empty.
assert(5 is int);
i := 5;
assert((i) is int);
assert(!((i) is long));
v := std::any();
assert((v) is void); // `v.operator is<void>()`
v = 5;
assert((v) is int); // `v.operator is<int>()`
Test expression has a particular value - (x) is v attempts:
- x.operator is(v)
- x == v
- x as V == v where V is the type of v
- v(x) if the result is bool
i := 5;
assert((i) is 5);
v := std::any(i);
assert((v) is 5);
The last lowering makes it possible to test a value by calling a predicate function:
pred: (x: int) -> bool = x < 20;
test_int: (i: int) = {
if (i) is (pred) {
std::println("(i)$ is less than 20");
}
}
main: () = {
test_int(5);
test_int(15);
test_int(25);
}
Note that pred is not a type identifier so it must be parenthesized.
Test an expression against a template - (x) is Template attempts:
- true if the type of x is an instance of Template.
- Template<(x)> if the result is convertible to bool.
inspect
- inspectExpression:
  - inspect constexpr? expression -> type { alternative+ }
- alternative:
  - alt-name? pattern = statement
  - alt-name? pattern { alternative+ }
- alt-name:
  - identifier :
- pattern:
  - is (type | expression | template)
  - as type
  - if expression
  - pattern || pattern
  - pattern && pattern
Only is alternatives without alt-name are implemented at the moment.
v : std::any = 12;
main: () = {
s: std::string;
s = inspect v -> std::string {
is 5 = "five";
is int = "some other integer";
is _ = "not an integer";
};
std::println(s);
}
An inspect expression must have an is _ case.
Unimplemented: an inspect statement has the same grammar except
there must be no -> type after the expression.
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2392r2.pdf
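For readers coming from Cpp1, the inspect example above corresponds roughly to a chain of checks on the std::any. The sketch below is a hand-written equivalent, not cppfront's output; describe is a hypothetical name.

```cpp
#include <any>
#include <cassert>
#include <string>

// Mirrors the three alternatives of the inspect example, tested in order.
inline std::string describe(const std::any& v) {
    if (const int* p = std::any_cast<int>(&v)) {
        if (*p == 5) {
            return "five";                // is 5
        }
        return "some other integer";      // is int
    }
    return "not an integer";              // is _
}
```

As in the Cpp2 version, the order of the tests matters: the value test for 5 has to come before the more general type test for int.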
Move Expressions
A variable can be explicitly moved. The move constructor of z takes the contents of x, leaving x in a moved-from state:
x: std::string = "hi";
z := (move x);
assert(z == "hi");
assert(x == "");
See also Implicit Move on Last Use.
Statements
A condition expression does not require parentheses in Cpp2, though when a statement immediately follows a condition, a blockStatement is required.
if
- ifStatement:
  - if constexpr? expression blockStatement elseClause?
- elseClause:
  - else blockStatement
  - else ifStatement
if c1 {
...
} else if c2 {
...
} else {
...
}
Assertions
x := 1;
assert(x == 1);
Parameterized Statement
- parameterizedStatement:
  - parameterList statement
A parameterized statement declares one or more variables that are defined only for the scope of statement.
(tmp := some_complex_expression) func(tmp, tmp);
// tmp no longer in scope
Valid parameterStorage keywords are in, copy, inout.
while
- whileStatement:
  - while expression nextClause? blockStatement
- nextClause:
  - next expression
If next is present, its expression will be evaluated at the
end of each loop iteration.
// prints: 0 1 2
(copy i := 0) while i < 3 next i++ {
std::println(i);
}
Note: The above is a parameterizedStatement.
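The loop above, with its scoped copy i and next clause, reads much like a Cpp1 for loop. A minimal sketch that collects the values instead of printing them (count_to is a hypothetical name):

```cpp
#include <cassert>
#include <vector>

// `(copy i := 0) while i < limit next i++ { ... }` written as a Cpp1 for loop.
inline std::vector<int> count_to(int limit) {
    std::vector<int> seen;
    for (int i = 0; i < limit; i++) {  // i++ runs at the end of each iteration
        seen.push_back(i);
    }
    return seen;
}
```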
do
- doWhileStatement:
  - do blockStatement nextClause? while expression ;
// prints: 0 1 2
i := 0;
do {
std::println(i);
} next i++ while i < 3;
for
- forStatement:
  - for expression nextClause? do ( parameter ) statement
The first expression must be a range.
parameter is initialized from each element of the
range. The parameter type is inferred.
parameter can have inout parameterStorage.
vec: std::vector<int> = (1, 2, 3);
for vec do (inout e)
e++;
assert(vec[0] == 2);
for vec do (e)
std::println(e);
Labelled break and continue
The target of these statements can be a labelled loop.
outer: while true {
j := 0;
while j < 3 next j++ {
if done() {
break outer;
}
}
}
Functions
Function Types
- functionType:
  - parameterList returnSpec
- parameterList:
  - ( parameter? )
  - ( parameter (, parameter)+ )
- parameter:
  - parameterStorage? type.
  - parameterStorage? identifier ...? : type.
- returnSpec:
  - -> (forward | move)? type
  - -> parameterList
E.g. (int, float) -> bool.
Function Declarations
- functionDeclaration:
  - identifier? : parameterList returnSpec? ;
  - identifier? : parameterList returnSpec? contracts? = functionInitializer
  - identifier? : parameterList expression ;
Function declarations extend the declaration form. Each parameter must have an identifier.
If returnSpec is missing with the first two forms, the function returns void.
The return type can be inferred from the initializer by using -> _.
See also Template Functions.
Function Bodies
- functionInitializer:
  - (expression ; | statement)
A function is initialized from a statement or an expression.
d: (i: int) = std::println(i);
e: (i: int) = { std::println(i); } // same
If the function has a returnSpec, the expression form implies a return statement.
f: (i: int) -> int = return i;
g: (i: int) -> int = i; // same
Lastly, -> _ = together can be omitted:
h: (i: int) i; // same as f and g
This form is useful for lambda functions.
Named Return Values
When a function returns a parameterList, each parameter must be named. A function with multiple named return parameters returns a struct with a member for each parameter.
f: () -> (i: int, s: std::string) = {
i = 10;
s = "hi";
}
main: () = {
t := f();
assert(t.i == 10);
assert(t.s == "hi");
}
- Unless a return parameter has a default value, it must be initialized in the function body.
- When only one return parameter is declared, the caller does not use member syntax to access the result.
f: () -> (ret: int = 42) = {}
main: () = {
assert(f() == 42);
}
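The statement that multiple named return values come back as a struct can be pictured in Cpp1 as below. The names f_ret and f_named are hypothetical; the actual name and layout of the struct cppfront generates are not specified here.

```cpp
#include <cassert>
#include <string>

// Hypothetical struct with one member per named return parameter.
struct f_ret {
    int i;
    std::string s;
};

// Counterpart of `f: () -> (i: int, s: std::string) = { i = 10; s = "hi"; }`
inline f_ret f_named() {
    f_ret r;
    r.i = 10;
    r.s = "hi";
    return r;
}
```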
main
- mainFunction:
  - main : ( args? ) (-> int)? = functionInitializer
If args is declared, it is a std::vector<std::string_view> containing
each command-line argument to the program.
Uniform Call Syntax
If a method doesn't exist when using method call syntax, and there is a function whose first parameter can take the type of the 'object' expression, then that function is called instead.
main: () -> int = {
// call C functions
myfile := fopen("xyzzy", "w");
myfile.fprintf("Hello %d!", 2); // fprintf(myfile, "Hello %d!", 2)
myfile.fclose(); // fclose(myfile)
}
Parameter Passing
- in - default, read-only. Will pass by reference when more efficient, otherwise pass by value.
- inout - pass by mutable reference.
- out - must be written to. Can accept an uninitialized argument, otherwise destroys the argument. The first assignment constructs the parameter. Used for constructors.
- move - argument can be moved from. Used for destructors.
- copy - argument can be copied from.
- forward - accepts lvalue or rvalue, pass by reference.
e: (i: int) = i++; // error, `i` is read-only
f: (inout i: int) = i++; // mutate argument
g: (out i: int) = {
v := i; // error, `i` used before initialization
// error, `i` was not initialized
}
Functions can return by reference:
first: (forward v: std::vector<int>) -> forward int = v[0];
main: () -> int = {
v : std::vector = (1,2,3);
first(v) = 4;
}
https://github.com/hsutter/cppfront/blob/main/regression-tests/mixed-parameter-passing.cpp2
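For the lvalue case used in the example above, returning by reference looks like this in Cpp1. The sketch covers only a mutable lvalue argument, not the full forward semantics; demo_first is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Returns a reference to the first element so the caller can assign through it.
inline int& first(std::vector<int>& v) { return v[0]; }

// Hypothetical helper mirroring the body of main in the example above.
inline std::vector<int> demo_first() {
    std::vector<int> v{1, 2, 3};
    first(v) = 4;  // writes into v[0]
    return v;
}
```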
Contracts
vec: std::vector<int> = ();
insert_at: (where: int, val: int)
pre(0 <= where && where <= vec.ssize())
post(vec.ssize() == vec.ssize()$ + 1) = {
vec.insert(vec.begin() + where, val);
}
The postcondition compares the vector size at the end of the function call with an expression that captures the vector size at the start of the function call.
A single named return is useful to refer to a result in a postcondition:
f: () -> (ret: int)
post(ret > 0) = {
ret = 42;
}
Function Literals
A function literal is declared like a named function, but omitting the leading identifier. Variables can be captured:
s: std::string = "Got: ";
f := :(x) = { std::println(s$, x); };
f(5);
f("str");
- s$ means capture s by value.
- s&$* can be used to dereference the captured address of s.
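By-value capture has a direct Cpp1 counterpart in a lambda capture list. The sketch below returns the formatted text instead of printing it, so the effect of the copy is visible; capture_demo is a hypothetical name.

```cpp
#include <cassert>
#include <sstream>
#include <string>

inline std::string capture_demo() {
    std::string s = "Got: ";
    // [s] copies s into the closure, comparable to `s$` in Cpp2.
    auto f = [s](const auto& x) {
        std::ostringstream out;
        out << s << x;
        return out.str();
    };
    s = "changed";  // does not affect the copy held by f
    return f(5) + "|" + f("str");
}
```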
Template Functions
A template function declaration can have template parameters:
- functionTemplate:
  - identifier? : templateParameterList? parameterList returnSpec? requiresClause?
E.g. size: <T> (v: T) -> _ = v.length();
When a function parameter type is _, this implies a template with a
corresponding type parameter.
A template function parameter can also be just identifier.
f: (x: _) = {}
g: (x) = {} // same
Variadic Template Functions
print: (a0) = std::print(a0);
print: (a0, args...) = {
print(a0);
print(", ");
print(args...);
}
main: () = print(1, 2, 3);
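The same recursive pattern in Cpp1 uses a template parameter pack. In this sketch the output goes to a stream passed as the first argument, so the result can be inspected; that extra parameter and the name joined are assumptions, not part of the Cpp2 example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Base case: a single argument.
template <typename T>
void print(std::ostream& out, const T& a0) {
    out << a0;
}

// Recursive case: print the first argument, a separator, then the rest.
template <typename T, typename... Args>
void print(std::ostream& out, const T& a0, const Args&... args) {
    print(out, a0);
    print(out, ", ");
    print(out, args...);
}

// Hypothetical helper: the text produced by the print(1, 2, 3) call.
inline std::string joined() {
    std::ostringstream out;
    print(out, 1, 2, 3);
    return out.str();
}
```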
User-Defined Types
type declares a user-defined type with data members and member functions.
When the first parameter is this, it is an instance method.
myclass : type = {
data: int = 42;
more: std::string = std::to_string(42);
// method
print: (this) = {
std::println("data: (data)$, more: (more)$");
}
// non-const method
inc: (inout this) = data++;
}
main: () = {
x: myclass = ();
x.print();
x.inc();
x.print();
}
Data members are private by default, whereas methods are public.
Member declarations can be prefixed with private or public.
operator=
Official docs: https://github.com/hsutter/cppfront/wiki/Cpp2:-operator=,-this-&-that.
operator= with an out this first parameter is called for construction.
When only one subsequent parameter is declared, assignment will also
call this function.
operator=: (out this, i: int) = {
this.data = i;
}
...
x: myclass = 99;
x = 1;
With only one parameter move this, it is called to destroy the object:
operator=: (move this) = {
std::println("destroying (data)$ and (more)$");
}
Objects are destroyed on last use, not end of scope.
Inheritance
base: type = {
operator=: (out this, i: int) = {}
}
derived: type = {
this: base = (5); // declare parent class & construct with `base(5)`
}
Type Templates
- typeTemplate:
  - identifier? : templateParameterList? type requiresClause?
Templates
- templateParameterList:
  - < templateParameters >
- templateParameter
  - identifier ...? (: type)?
  - identifier : type
The first parameter form accepts a type.
The second parameter form accepts a value. To use a constant identifier as a template parameter, enclose it in parentheses:
f: <i: int> () -> _ = i;
n: int == 5;
...
std::println(f<(n)>());
n is a constant alias.
Constraints
- requiresClause:
  - requires constExpression
defaultValue: <T> () -> T requires std::regular<T> = { v: T = (); return v; }
...
assert(defaultValue<int>() == 0);
Note: Using an inline concept for a type parameter is not supported yet.
Concepts
- concept:
  - identifier : templateParameterList concept requiresClause? = constExpression ;
arithmetic: <T> concept = std::integral<T> || std::floating_point<T>;
...
assert(arithmetic<i32>);
assert(arithmetic<float>);
Aliases
Aliases are defined using == rather than =.
- alias:
  - identifier : templateParameterList? type? == constExpression
  - identifier : templateParameterList? functionType == functionInitializer
  - identifier : templateParameterList? type == type
  - identifier : namespace == identifierExpression
The forms above are equivalent to the following Cpp1 declarations:
- constexpr variable
- constexpr function
- using type alias
- namespace alias
// constant template
size: <T> size_t == sizeof(T);
// compile-time function
init: <T> () -> T == ();
main: () = {
static_assert(size<char> == 1);
// constant aliases
v := 5;
//n :== v; // error, cannot read `v` at compile-time
n :== 6; // OK
myfunc :== main;
static_assert(init<int>() == 0);
view: type == std::string_view;
N4: namespace == std::literals;
}
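The four Cpp1 equivalents listed above can be spelled out for the example as follows. This is a hand-written sketch at namespace scope, not cppfront output; the variable template is named size_of here (a hypothetical rename of size) to keep it distinct from std::size.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// size: <T> size_t == sizeof(T);      -> constexpr variable template
template <typename T>
constexpr std::size_t size_of = sizeof(T);

// init: <T> () -> T == ();            -> constexpr function template
template <typename T>
constexpr T init() { return T(); }

// n :== 6;                            -> constexpr variable
constexpr int n = 6;

// view: type == std::string_view;     -> using type alias
using view = std::string_view;

// N4: namespace == std::literals;     -> namespace alias
namespace N4 = std::literals;

static_assert(size_of<char> == 1);
static_assert(init<int>() == 0);
```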