p4-spec icon indicating copy to clipboard operation
p4-spec copied to clipboard

Add expression to represent an invalid header

Open jfingerh opened this issue 2 years ago • 18 comments

jfingerh avatar Nov 12 '22 21:11 jfingerh

One possible change to the spec that would resolve this issue: https://github.com/p4lang/p4-spec/issues/1021

jafingerhut avatar Nov 12 '22 21:11 jafingerhut

Regarding changelogs, I will beg, repeatedly, to let me modify the change log in separate PRs from the changes. I really, really dislike rebasing spec PRs solely because the change log needs rebasing, sorry.

jfingerh avatar Nov 21 '22 19:11 jfingerh

Fair. But as the person who creates the releases of the spec, let me beg all authors to include a ChangeLog. It's very painful for me to recreate them from the Git history because people forget to add them.

jnfoster avatar Nov 21 '22 19:11 jnfoster

Note I didn't notice an issue with # using outside of the preprocessor until I was trying to implement this in my front-end. Take:

#define line }
header h{}
h f()
{
  return (h){
# line ;
}

Yes this is odd looking source (and one which you hope never see in the real life) but it shows ambiguity in the way the preprocessor and parser interact together. And is one of the reasons why # is not used outside of the preprocessor for C/C++.

apinski-cavium avatar Nov 21 '22 20:11 apinski-cavium

Thought about this, but this is the only ASCII special character that seemed reasonable.

mihaibudiu avatar Nov 21 '22 21:11 mihaibudiu

{:}, {!}, {?}, {-} all seems reasonable alternatives. I am a fan of ! or - here, ? also seems reasonable but not my favorite. - and ! might have grammar issues but I think ? or : would not as the ? is only used in the ?: operator and requires an expression in front of it. {*} might be another option. P4 does not have an unary * operator yet either. Or we could just add a following to (or something similar worded):

`#` cannot be the first non-white space character in a line as it will be used by the preprocessor.

apinski-cavium avatar Nov 21 '22 21:11 apinski-cavium

This is the simplest testcase which shows the issue with # on the front of the line:

header h{}
h f()
{
  return (h){
# };
}

I definitely could see this happening somehow on accident and the developer not understanding why there was an error message either.

apinski-cavium avatar Nov 22 '22 03:11 apinski-cavium

I have added commit 3 to this PR that adds a note that # must not be the first non-white space character of a line, following Andrew Pinski's suggestion in an earlier comment (if we keep the currently proposed syntax using {#}).

jafingerhut avatar Nov 29 '22 02:11 jafingerhut

If the token is {#} then it cannot be the first character of the line...

Generally, tokens are indivisible. For example, the following is not legal P4:

co nt rol C() { 
 t
 a
 b
 l
 e
 t { k   e  y s = {} 
      a c t i o n s = {} 
  }
  ap ply { 
     t.a pp l y();
  } 

jnfoster avatar Nov 29 '22 02:11 jnfoster

If the token is {#} then it cannot be the first character of the line...

Right but the current implementation inside p4c is three different tokens { # } as seen by https://github.com/p4lang/p4c/pull/3667/files#diff-85fbe45d422156081940f1eb7e38cd4ff1bd03574da101de1cf7021a166851c9R187 .

I am ok with {#} being a token but that definitely requires some changes to p4c now.

apinski-cavium avatar Nov 29 '22 03:11 apinski-cavium

If # only ever has meaning when it appears immediately within a set of curly braces, then I think {#} should be a token...

This is a trivial front-end change. Even I could do it :-)

jnfoster avatar Nov 29 '22 03:11 jnfoster

Currently it is 3 tokens - spaces allowed. Having it be 1 token is a bit ugly, but would probably solve the problem.

mihaibudiu avatar Nov 29 '22 17:11 mihaibudiu

Yesterday during the LDWG discussion the idea was floated to use H.invalid for an invalid header. This doesn't work so well when the header type is generic:

header H<T> {
   T field;
}

The syntax is more unwidely: H<bit<32>>.invalid. The currently proposed syntax is (H<bit<32>>){#}.

mihaibudiu avatar Dec 06 '22 18:12 mihaibudiu

The syntax is more unwidely: H<bit<32>>.invalid. The currently proposed syntax is (H<bit<32>>){#}.

I don't have a preference here about them really. I think both are their pros and cons in implementing and even to the usability to the user. The .invalid is more verbose but easier to understand without reading the specifications. While the {#} will allow a "generic" invalid header which might be allow to be used without a "cast" later on.

apinski-cavium avatar Dec 06 '22 18:12 apinski-cavium

I just noticed one thing which is missing (it is a minor issue though). The invalid header literal is not listed as a compile time known constant in the "Compile-time known values" section. I had assumed one of the reasons to do this was explicitly for a compile time known value for an invalid header.

apinski-cavium avatar Dec 06 '22 19:12 apinski-cavium

@apinski-cavium I added commit 5 to this PR that adds (H) {#} to the list of compile-time known values.

jafingerhut avatar Dec 07 '22 04:12 jafingerhut

I see you also added about lists there; I am ok with both changes.

apinski-cavium avatar Dec 07 '22 04:12 apinski-cavium

TODO for Andy when all other open questions about this PR have been resolved: Add changes to grammar.mdk file, if necessary.

jafingerhut avatar Dec 08 '22 18:12 jafingerhut

This can be addressed afterwards but how would be the best way to specify an header union without any valid headers? I would have assumed we could do the following also:

header H1{}
header H2{}
header_union HU { H1 h1; H2 h2; }
HU func() { return (HU){#}; }

but that is rejected currently:

headerunion-1.p4(4): [--Werror=type-error] error: cast: invalid header expression has a non-header type `header_union HU`
HU func() { return (HU){#}; }
                   ^^^^^^^
headerunion-1.p4(3)
header_union HU { H1 h1; H2 h2; }
             ^^

apinski-cavium avatar Jan 09 '23 18:01 apinski-cavium

One way to do it is to use any of the headers in the union. Haven't tried to see if it works, though.

mihaibudiu avatar Jan 09 '23 18:01 mihaibudiu

That probably won't work. We'll have to define {#} for unions too.

mihaibudiu avatar Jan 09 '23 18:01 mihaibudiu

We decided to make {#} a single token.

mihaibudiu avatar Jan 09 '23 21:01 mihaibudiu

@jafingerhut I think only a tiny change to the grammar is needed: https://github.com/p4lang/p4c/pull/3835

mihaibudiu avatar Jan 10 '23 20:01 mihaibudiu

2023-Mar LDWG meeting Nate pointed out that this set of issues are very similar to each other, both in how they are specified and likely how they are implemented in p4c. It would be nice if they had a consistent definition in terms of whether the (typeRef) before them was considered a cast or part of the syntax of the expression, and if it is not a required part of the expression, what is the type of the expression when there is no cast. There may be other related questions raised in the LDWG discussion that I am not capturing here.

  • This issue
  • https://github.com/p4lang/p4-spec/pull/1220
  • Perhaps also this one? Not sure: https://github.com/p4lang/p4-spec/pull/1218

@jnfoster If you have any thoughts on what you would like to see in the spec for these, please let us know.

jafingerhut avatar Mar 14 '23 18:03 jafingerhut

My knee-jerk reaction is that I prefer to keep terms that provide some new feature (e.g., invalid headers, struct literals, etc.) separate from terms that convert types (e.g., casts). We have bonded these two together in some cases but not others.

But as a general scheme:

  • terms that provide new feature, but are possibly hard to type check
  • casts that are allowed in specified contexts

jnfoster avatar Mar 15 '23 18:03 jnfoster

@jnfoster Do you have a recommendation for what spec language you would like to see for this PR and the ones linked above in this comment: https://github.com/p4lang/p4-spec/pull/1184#issuecomment-1468638149

such that the resulting PRs would be as consistent as you would like them to be with each other, and the rest of the spec?

jafingerhut avatar Mar 28 '23 18:03 jafingerhut

I don't really care about the concrete syntax. My wish was to separate constructs that denote polymorphic literals like invalid headers from casts, and to do this consistently throughout the language.

jnfoster avatar Mar 29 '23 12:03 jnfoster

I may have misunderstood some of the discussion in the March 2023 LDWG meeting. I was under the impression that some of your concerns expressed there would require at least one, or all of:

  • some changes to this PR
  • some change to the PR introducing an expression for invalid header union values
  • perhaps even changes to other parts of the spec to clarify or change the syntax that "looks like a cast" for structure-values expressions, to clarify what that actually is that is not a cast. (I am unclear whether it technically is a cast, or not.)

If this PR and the following ones are ready to be approved and merged in as they are written now, then I got confused in the last meeting that for some reason, they are not ready.

  • https://github.com/p4lang/p4-spec/pull/1220
  • Perhaps also this one? Not sure: https://github.com/p4lang/p4-spec/pull/1218

jafingerhut avatar Mar 29 '23 18:03 jafingerhut

The problem with #1220 and #1218 from my perspective is that they require a cast to appear along with the invalid header.

jnfoster avatar Apr 03 '23 19:04 jnfoster

If we change those PRs so that they do not require a cast, then is it important to you if there is a defined type for the expression {#} by itself? If so, should it be some new type that is "an invalid header or header union value of some type, but it could be any header type or header union type"?

jafingerhut avatar Apr 04 '23 02:04 jafingerhut