Literal syntax to represent the value of "an invalid header"
Personnel
- [x] Owner: @jafingerhut
- [x] Supporters: @mbudiu-vmw
Design
- [x] Document: See rest of this comment below
Implementation
- [x] p4-spec: https://github.com/p4lang/p4-spec/pull/1184
- [x] p4c: https://github.com/p4lang/p4c/pull/3667
Process
- [x] LDWG discussed: April 2021
- [ ] LDWG approved:
- [ ] Merged into p4-spec:
- [x] Merged into p4c: https://github.com/p4lang/p4c/pull/3667 (as experimental PR)
=======================================
On 2022-Nov-10 this experimental change to p4c was merged into its code base, making the syntax (H) {#} mean "the value that is an invalid header with type H", where H must be the name of a header type.
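
A minimal sketch of how the (H) {#} syntax might be used, assuming a p4c build that includes the experimental change. The header type H with two 8-bit fields, the control name c, and the parameter name valid_after_init are all hypothetical names chosen for illustration.

```p4
// Hypothetical header type; field names and widths are assumptions.
header H {
    bit<8> f1;
    bit<8> f2;
}

control c(out bool valid_after_init) {
    apply {
        // Initialize h with the literal for "an invalid header of type H".
        H h = (H) {#};
        // Under the semantics described above, h is invalid here,
        // so this should record false.
        valid_after_init = h.isValid();
    }
}
```

The explicit (H) prefix names the header type directly, so the sketch does not depend on whether the compiler can infer the type from context.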
I am not sure, but I think that an answer to this question might be blocking the completion of an implementation of structure-valued expressions, which have been part of the language spec since May 2021 but are not yet implemented in p4c. There is only a PR with a partial implementation here: https://github.com/p4lang/p4c/pull/2368
Related issue that raised this question in 2021-Apr: https://github.com/p4lang/p4-spec/issues/341 but this detailed part of it on a literal syntax to mean "a value that is an invalid header" remains unresolved so far.
See these sections of the spec:
- "Structure-valued expressions" https://p4.org/p4-spec/docs/P4-16-v1.2.2.html#sec-structure-expressions (Section 8.12 in version 1.2.2)
- "Initializing with default values" https://p4.org/p4-spec/docs/P4-16-v1.2.2.html#sec-initializing-with-default-values (Section 8.23 in version 1.2.2)
(note that these links are to version 1.2.2 of the spec, not the latest main revision, so look for the sections with these names in the latest main revision to be completely up-to-date, in case they have changed)
Notes:
The section "Structure-valued expressions" says: "For a structure-valued expression typeRef is the name of a struct or header type." It does not explicitly say whether the resulting header value is valid or invalid, but the intent was probably that a structure-valued expression with a header type always represents a header value that is valid. If that is correct, it might be nice to have a phrase that says so explicitly in that section.
Aside, for which a separate issue has been created here (https://github.com/p4lang/p4-spec/issues/1032): Should structure-valued expressions allow ... inside of them?
These examples are given for initializing headers in the "Initializing with default values" section:
H h1 = ...; // initialize h1 with a header that is invalid
H h2 = { f2=5, ... }; // initialize h2 with a header that is valid, field f1 0, field f2 5
H h3 = { ... }; // initialize h3 with a header that is valid, field f1 0, field f2 0
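
To make the intended meaning of these three initializers concrete, here is a sketch of what each one could look like with the ... written out by hand. The spec example does not give field widths, so the definition of H below (two 8-bit fields) and the control name c are assumptions for illustration only.

```p4
// Assumed definition of H; the spec example only names fields f1 and f2.
header H {
    bit<8> f1;
    bit<8> f2;
}

control c(out H h1, out H h2, out H h3) {
    apply {
        // H h1 = ...;            -> an invalid header
        h1.setInvalid();
        // H h2 = { f2=5, ... };  -> valid, f1 takes its default value 0
        h2 = (H) { f1 = 0, f2 = 5 };
        // H h3 = { ... };        -> valid, every field takes its default value
        h3 = (H) { f1 = 0, f2 = 0 };
    }
}
```

Note that the first case has no expression form here: it needs a setInvalid() call, which is a statement rather than a value.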
I believe Mihai has expressed an interest in being able to transform source code like H h1 = ...; into a more explicit low-level form where the ... is replaced with a literal expression denoting an invalid header. I believe this is because ... is really intended to be syntactic sugar for the default value of any type. I am not sure, but perhaps one reason for wanting a literal that means only "an invalid header" and nothing else is that it would enable a pass of p4c to transform code like this:
header h1_t { /* header fields here */ }
struct s1_t {
    h1_t h1;
    bit<8> f2;
}
// later
s1_t x = { ... };
into this (for this example, assume the syntax !{} is the literal chosen to mean "an invalid header"):
s1_t x = {!{}, 0};
Note that ... is NOT very useful for this purpose, because it would transform the original code into this:
s1_t x = {..., 0};
which is illegal according to the "Initializing with default values" section of the spec, because ... must be last in a list expression. Perhaps the compiler could instead transform it into this:
s1_t x = {h1=..., f2=0};
but it is not clear to me whether that is allowed or precisely defined in the spec today.
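
For comparison, here is a sketch of the same explicit form written with the (H) {#} syntax that p4c later accepted experimentally (see the note at the top of this issue). The field inside h1_t and the control name c are placeholders, and whether the (h1_t) prefix is required in this position is an assumption; it is written out to be safe.

```p4
header h1_t {
    bit<16> f1;   // placeholder field, not part of the original example
}

struct s1_t {
    h1_t h1;
    bit<8> f2;
}

control c(out s1_t result) {
    apply {
        // Explicit counterpart of `s1_t x = { ... };`:
        // h1 is an invalid header, f2 takes its default value 0.
        s1_t x = { (h1_t) {#}, 0 };
        result = x;
    }
}
```

This keeps the positional list form of the !{} example above and only swaps in a different literal.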
Possible expressions:
- invalid or invalid_header or some other new reserved keyword in the P4_16 language spec (Mihai proposed this in an older issue)
  - Disadvantage: I believe a name like this would become a name that could conflict with names of things like tables, extern object instances, etc. (all other unqualified identifiers in P4), which is less severe of an issue the longer the keyword is.
- ! or some other short symbol (Ben Pfaff proposed this in an older issue)
- !{}, {!}, ~{}, {~} - some other short sequences of symbols
- {} - I would recommend NO for the following reason: p4c today lets you initialize a header variable whose type is a header with 0 fields in it, to the expression {}, and it is initialized as a valid header. That behavior is consistent with a header becoming valid when initialized with an N-element list expression and the header has N fields of corresponding types. The spec does explicitly allow header types with 0 fields in them (see https://p4.org/p4-spec/docs/P4-16-v1.2.2.html#sec-header-types).
- {.valid=false} or perhaps some other sequence of characters in place of .valid there. The main point is that it is impossible for that sequence of characters to collide with a user-defined header field name.
  - Not clear whether .valid would conflict with possible uses of hierarchical names defined elsewhere in spec.
- header_type_name.invalidHeader(), where header_type_name is the name of a header type. This even builds in the type of the invalid header into the expression, although that could be considered a disadvantage if you want a syntax that can represent an invalid header of an arbitrary header type.
  - Advantage: header type is always explicit
  - Disadvantage: Fairly large number of characters if you want to write it.
Whatever literal syntax we devise for this, I am guessing that like structure-valued expressions, the header type can be omitted if it can be inferred from context, or it can be given explicitly via (typeRef) just before it.
With this version of p4c source code, latest as of 2022-Feb-26:
commit 7988ea2b700266a419a751e3f5e2297c5dd7902d
Author: Mihai Budiu
Date: Sat Feb 26 19:13:16 2022 -0800
If you declare a header type with 0 fields, and initialize a header variable of that type to {}, it is initialized as a valid header. This seems to me consistent with initializing a header with N fields to a list expression that gives explicit values for all N fields, which also initializes that header as valid.
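
A small sketch of the zero-field case just described. The names empty_t, c, and v are hypothetical, and the behavior noted in the comments is the one observed with the p4c commit above; other versions may differ.

```p4
// Header type with 0 fields, which the spec explicitly allows.
header empty_t { }

control c(out bool v) {
    apply {
        // Empty list expression: 0 values for 0 fields.
        empty_t e = {};
        // Observed behavior: e is valid after this initialization,
        // so {} cannot also serve as "an invalid header".
        v = e.isValid();
    }
}
```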
If you declare a header type with N >= 1 fields and attempt to initialize it with an empty list expression, you get an error like the one below, which seems reasonable to me. You get a similar message if you use a list expression with a different number of elements than the number of fields in the header, which also seems reasonable (I am talking about list expressions that do NOT use field_name= syntax in them, only a list of values):
header-exprs1.p4(78): [--Werror=type-error] error: h2
h2_t h2 = {};
^^^^^^^^^^^^^
---- Actual error:
header-exprs1.p4(71): Number of fields 0 in initializer tuple<> is different than number of fields 2 in 'header h2_t'
TODO: It seems worth explicitly mentioning in the "Structure-valued expressions" section that it is an error if a structure-valued expression does not provide values for every field of the struct or header, unless it has , ... at the end.
We discussed this in April 2021. It's also related to https://github.com/p4lang/p4-spec/issues/341
Thanks for the link to the earlier issue. I have gone through the comments on #341 and added most or all of the proposed ideas for syntax to represent an invalid header to the bullet list in the original issue, by editing it.
PR #1031 is one proposal for how to address some of the open questions in the comments above, but it does not yet mention the primary issue of a literal syntax that means "an invalid header". I expect to update that PR, or open a separate one, to define the literal syntax of an invalid header once that has been agreed upon.
Discussed in the 2022-Mar-07 LDWG meeting: Note that ... CAN be used to mean an invalid header when desugaring, with some extra cases when serializing the IR to P4_16 code in p4c. Thus adding a literal syntax for an invalid header is merely an additional convenience of expression, not new expressive power in the language.
Created a separate issue for the question of whether ... should be permitted in non-initializing contexts of P4 code here https://github.com/p4lang/p4-spec/issues/1032
@mbudiu-vmw Do you have any thoughts on having no special literal syntax that means only "an invalid header", and somewhat complicating the "IR-to-P4_16" code generation in p4c so that whenever there is an invalid header implied by ..., it remains implied by ... always?
Comment on p4c changes currently being considered to implement ... behavior, that might be relevant when considering what to do about this language spec issue: https://github.com/p4lang/p4c/pull/2368#issuecomment-1063092163 (and my comment replying to it immediately afterwards).
If we allow ... in any position then it can indeed be used to indicate an uninitialized header.
Currently , ... at the end of a list expression means that all other fields of the struct or header that have not yet had a value specified for them should be given their default values.
Would trying to use ... as the last element ever lead to ambiguity of meaning if we also use it to mean an uninitialized header in the last position?
I need to write an initializer like { a = ..., b = 2 }, if a is a header that must be invalid.
@mbudiu-vmw @jnfoster How would you recommend that we resolve an issue like this? Mihai has stated on multiple occasions that deciding upon a literal syntax for an invalid header, one that is not ..., should help open source p4c finish its implementation of the language spec. Open source p4c as of 2022-Oct does not yet implement the ... syntax, even though it has been in the language spec since May 2021.
Would it help to just have a straight-out vote on preferred literal syntax, like in this Google sheet? https://docs.google.com/spreadsheets/d/1wtdGUV40frg3NR6lfwOW_kgVgzoUNEh2dYbkkaudofE/edit#gid=0
Just pick something and go?
I'm concerned that this issue will linger for another 1.5 years or more.
I think we should try to reach consensus rather than making a snap decision.
I think the 1.5 years delay is mostly due to lack of interest / pushing from anyone in the LDWG. So with your nudge, we will get this back on the agenda and let's drive it to a conclusion.
@jnfoster Sounds good to me. Happy to do some nudging, if it helps. I know that ... is "only syntactic sugar", i.e. any program written using ... can be written a different way syntactically without ..., but it seems like pretty useful sugar to me.
Fixed by #1240