"Compile-time known" versus "statically known" versus "instantiation-time known"
I suggest distinguishing two kinds of "compile-time known values", namely "statically known values" and "instantiation-time known values", instead of coarsely saying "compile-time known values".
Phase distinction
In P4, there are actually three phases of evaluation: static evaluation, instantiation-time evaluation, and dynamic evaluation. Static evaluation happens when typechecking (which is a single pass). Instantiation-time evaluation happens when generating instances and going through the instance body to allocate resources. Dynamic evaluation happens when processing packets.
The first two phases both happen at compile time, so the values evaluated in both phases are literally compile-time known, but we need to be careful with the difference between them.
Examples
const bit<32> x = 1; /* statically known */
control MyC()(bit<32> v) { /* v is instantiation-time known */
    const bit<32> y = 2; /* statically known */
    const bit<32> c = v; /* instantiation-time known, because v is instantiation-time known */
    apply {
    }
}
Statically known values versus instantiation-time known values
It makes sense to allow passing a constructor parameter (an instantiation-time known value) as a constructor argument in another constructor invocation. For example,
control C()(bit<32> v) { ... }
control MyC()(bit<32> v) {
    C(v) c;
    register<bit<8>>(v) reg;
    apply {
    }
}
But we don't want to allow using instantiation-time known values in typechecking. For example, this program should not be allowed:
control MyC()(bit<32> v) {
    bit<32> x;
    apply {
        f(x[v:10]); /* This is not allowed, because we don't know the type of x[v:10] during typechecking. */
    }
}
Problem with the current spec
- It's confusing that constructor parameters are not included in "compile-time known values".
- It looks useful to allow passing constructor parameters as constructor parameters.
Fix
We should clearly state the following in the spec:
- Constants that do not depend on constructor parameters are statically known values.
- Constructor parameters and their derivatives are instantiation-time known values.
- Both statically known values and instantiation-time known values are compile-time known.
- Only statically known values can be used in typechecking.
- Arguments bound to constructor parameters and directionless parameters (excluding action data) only need to be compile-time known, so they may themselves be constructor parameters.
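The rules above can be read as an ordering on when a value becomes known. The Python sketch below is only an illustrative model, not part of the spec or of any compiler. The names `Phase`, `phase_of`, `is_compile_time_known`, and `usable_in_typechecking` are hypothetical, and it assumes an expression is known no earlier than its latest operand.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Hypothetical ordering of evaluation phases (earlier is smaller)."""
    STATIC = 0          # known during typechecking
    INSTANTIATION = 1   # known when instances are generated
    DYNAMIC = 2         # known only while processing packets


def phase_of(operand_phases):
    """Assumed rule: an expression is as late as its latest operand.

    A literal has no operands, so it is statically known.
    """
    return max(operand_phases, default=Phase.STATIC)


def is_compile_time_known(phase):
    """Both static and instantiation-time values are compile-time known."""
    return phase <= Phase.INSTANTIATION


def usable_in_typechecking(phase):
    """Only statically known values may influence types."""
    return phase == Phase.STATIC


# const bit<32> y = 2;   (a literal)
y = phase_of([])
# const bit<32> c = v;   (v is a constructor parameter)
c = phase_of([Phase.INSTANTIATION])
```

Under this model, `c` is compile-time known and could be passed on to another constructor, but it could not appear as a slice bound such as `x[v:10]`, because `usable_in_typechecking(c)` is false.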
I realize there are actually four, instead of three, phases. To be precise, constructor parameters and directionless parameters need to be distinguished, too. For example,
/* This makes sense. */
control MyC()(bit<32> size) {
    register<bit<8>>(size) reg;
    apply {
    }
}
/* This doesn't make sense. */
control MyC(bit<32> size)() {
    register<bit<8>>(size) reg;
    apply {
    }
}
The spec is not clear about the requirements for directionless parameters that are neither constructor parameters nor action data. From my perspective, directionless parameters can be bound to directionless parameters and constructor parameters, but not to runtime data. For example,
extern void log(string msg);
extern void log_data<T>(T data);
/* This might make sense. */
control MyC(string msg)() {
    apply {
        if ( ... ) {
            log(msg);
        }
    }
}
/* This doesn't make sense. */
control MyC(inout bit<32> x)() {
    apply {
        if ( ... ) {
            log_data(x);
        }
    }
}
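The binding rules proposed in this comment can be summarized in a small table-like check. This Python sketch is a hypothetical model and not how any compiler implements it. `Source`, `can_bind_to_constructor`, and `can_bind_to_directionless` are made-up names, and treating plain constants as acceptable in both positions is an assumption that the comment does not state explicitly.

```python
from enum import Enum


class Source(Enum):
    """Hypothetical classification of what an argument expression refers to."""
    CONSTANT = "constant"
    CONSTRUCTOR_PARAM = "constructor parameter"
    DIRECTIONLESS_PARAM = "directionless parameter"
    RUNTIME_DATA = "runtime data"  # e.g. in/out/inout parameters


def can_bind_to_constructor(source):
    """Constructor arguments: constants (assumed) and constructor parameters."""
    return source in (Source.CONSTANT, Source.CONSTRUCTOR_PARAM)


def can_bind_to_directionless(source):
    """Directionless arguments: anything except runtime data."""
    return source is not Source.RUNTIME_DATA
```

For instance, the `register<bit<8>>(size) reg;` case with a directionless `size` corresponds to `can_bind_to_constructor(Source.DIRECTIONLESS_PARAM)`, which is false, while passing a directionless `msg` to `log` corresponds to `can_bind_to_directionless(Source.DIRECTIONLESS_PARAM)`, which is true.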
Directionless parameters should have been relegated to constructor parameters only. The reason we didn't do that is that constructor parameters are weird and very seldom used, and there is at least one case where we always need a directionless parameter -- the packet_in parameter for parsers. So we wanted parser writing to be nicer and not require two sets of parentheses:
parser p(inout headers h)(packet_in packet) {
    ...
}
The spec does not say how typechecking is implemented. I don't think we want to tie our hands by being overly specific. So I see no need to indicate what kinds of information can or cannot be used for typechecking. That is really an implementation and not a definition choice. That being said, I won't object to making the evaluation section from the spec clearer.
I agree. Let's see if we can tighten up the spec, but perhaps not go as far as mandating N phases, for large N...
What I'm thinking about is that we need to specify which programs typecheck and which programs don't. Maybe that's not true.
A related example to this is:
extern void f(int t);
void g(bit t)
{
    f(t+0);
}
Right now the reference compiler accepts the above only when "+0" is removed, even though the +0 is a no-op. Casts are accepted, though.
extern void f(int t);
void g(bit t)
{
    f((bit<3>)(int<3>)(int<1>)(bit<1>)t);
}
So it seems the reference compiler looks through casts but not other expressions.
I believe there may be a separate issue. Perhaps @mbudiu-vmw can weigh in: are functions not being type checked as they are declared, but only after being inlined?
I think this is subsumed by #1213.