p4-spec icon indicating copy to clipboard operation
p4-spec copied to clipboard

Assigning invalid header to union member vs Assigning to fields of invalid union member

Open MollyDream opened this issue 4 years ago • 27 comments

In 8.18, the spec illustrates with an example that assigning an invalid header to a union member is the same as setting the union member to invalid.

u.hi = e
// is equivalent to
u.hi.setInvalid(); // when e is invalid

Further, it explains that u.hi.setInvalid() means that "if the valid bit for any member header of u is true then sets it to false, which implies that reading any member header of u will return an unspecified value."

Adding these together, writing an invalid header to a union member invalidates all the union members.

In contrast, in 8.22, the spec states that "if a write is attempted to a field of the invalid member header u1.h2, then any or all of the fields of the valid member header u1.h1 may change as a result. Such a write must not change the validity of any member headers of u1, nor any other state that is currently defined in the system, whether it is defined state in header fields or anywhere else."

This means that writing to a field of an invalid member header does not invalidate other member headers.

We are wondering if there is any particular reason for such a discrepancy. Is it motivated by the behavior of existing switches?

We tested with the v1model, but it does not invalidate other header members in both situations.

header H1 {
  bit<8> f;
}
header H2 {
  bit<16> g;
}
header_union U {
  H1 h1;
  H2 h2;
}

{
  U u;
  H1 my_h1 = { 8w0 }; // my_h1 is valid
  u.h1 = my_h1;

  H2 my_h2; // my_h2 is invalid
  u.h2 = my_h2;
  if (u.h1.isValid()) {
    h.output = 0xDCBA;
  } else {
    h.output = 0x0101;
  }
}

With the above P4 code snippet, the v1model writes 0xDCBA to h.output. The behavior is the same if u.h2 = my_h2; is changed to u.h2.g = 1;

MollyDream avatar Dec 12 '21 16:12 MollyDream

I do not understand why you say there is a discrepancy.

Both of the parts of the spec you quote seem consistent to me.

Section 8.18 defines the behavior of assigning a value to an entire header member of the entire variable with type header_union.

Section 8.22 defines the behavior of assigning a value to one field of a header member of the union, if that header is invalid.

Each defines the behavior of different assignment statements, and neither one affects the other directly, as far as I can see, but please explain further if you see an inconsistency here.

jafingerhut avatar Dec 12 '21 17:12 jafingerhut

Regarding why things are defined this way, I suspect that header_union's might be one P4_16 feature that is not implemented on all ASIC implementations of P4, because of the extra difficulty involved, and there being other ways to achieve similar effects (e.g. manually write code that defines N different headers, put them in a struct, and manually maintain the invariant that at most one of them is invalid).

jafingerhut avatar Dec 12 '21 17:12 jafingerhut

Hi @jafingerhut Sorry about the confusion. By discrepancy, I actually meant "difference". (I just realized that "discrepancy" implies "a difference especially between things that should be the same.")

MollyDream avatar Dec 13 '21 16:12 MollyDream

What we want to ask is whether this difference really a designed feature?

QinshiWang avatar Dec 13 '21 16:12 QinshiWang

Maybe @vgurevich will have a better idea about Andy's suspicion. Is header union implemented/allowed in the Tofino architecture?

MollyDream avatar Dec 13 '21 16:12 MollyDream

A little bit more background rationale for why it is defined the way it is now. I am not saying that it must not change -- only trying to explain rationale for the way the spec is currently written.

For header_union, there should be some operation in the language that changes its state from "exactly one member is currently valid" to "no member is valid". There are design choices in what that operation is, but currently there are two ways, as you point out in section 8.18. I recall at one point early in 2017 the spec was written such that the only way to make a header_union variable invalid, if you did not already know which member was valid, was to call hdr_union_variable.member1.setInvalid(); hdr_union_variable.member2.setInvalid(); // etc ... one call to setInvalid per header_union member. That seemed like a bad design choice to me, and I believe it was then that the proposal was made that hdr_union_variable.setInvalid() could make all members invalid, with a single call.

Section 8.22's text you mention is basically giving P4 compiler developers the ability to actually share storage between header_union members in physical storage at run time. There are no features built into the P4 language to statically (i.e. at compile time) easily prevent accesses to invalid members of a header_union. So, what should the spec say is the behavior if one tries to assign to a field in a member header that is currently invalid, when another member is valid?

The current spec gives the implementers the freedom to just implement the assignment to the header by writing to the storage area it has allocated for the header_union variable as a whole. That might have the side effect of overwriting any or all of the bits of storage that represents the value of the currently valid member header. Thus the P4 programmer should expect that this might have happened, and ideally write their programs so they never make such assignments, if they do not want that to happen.

If we wanted to write a spec that says "an assignment to a field in a currently invalid header_union member has zero side effects", that is a greater imposition of static analysis and/or run-time cost to implement that behavior. Either it must statically prove that no other member is currently valid and do the assignment, or prove that another member IS currently valid and skip the assignment, or have a run-time check whether any other member is currently valid to decide whether to perform or skip the assignment. The current spec obviously didn't go that direction.

jafingerhut avatar Dec 13 '21 16:12 jafingerhut

https://github.com/p4lang/p4-spec/issues/995#issuecomment-992668650 Yes. We understand that the data of different header members in a header union are shared. We are wondering about the desired behavior of the validity bits.

QinshiWang avatar Dec 13 '21 17:12 QinshiWang

As for the validity bits, I think that they are "are always there", whether or not a particular header is valid or invalid, otherwise the basic validity checks would not work.

Therefore, I think that there is a reasonable assumption that they are going to be kept separately from the fields and are not going to be shared between headers, which is why changing validity bits of one of the headers in the header union should not be affecting the validity of the other headers. Tofino implementation does, indeed, follow this assumption, but I do not think that it is Tofino-specific in any way.

vgurevich avatar Dec 13 '21 19:12 vgurevich

@vgurevich Even for setValid() for a header member while another header member is also valid? That could result in both header members being valid.

QinshiWang avatar Dec 13 '21 21:12 QinshiWang

@QinshiWang -- that's precisely the point of header_union -- is to prevent them from happening :) But technically, yes, that's pretty easy to do. In other words, when the compiler sees the code

    u.h1.setValid();

it automatically adds (should automatically add)

   u.h2.setInvalid();
   ...
   u.hN.setInvalid();

(because u.h1 ... u.hN are the members of a header union.

When header unions are not used, then it is a responsibility of the P4 programmer, which makes header management more flexible and more conducive to various optimizations, but also more dangerous

vgurevich avatar Dec 13 '21 21:12 vgurevich

@vgurevich We are developing the mechanized semantics of P4. Although we can define it by automatically inserting statements, it's more clear to not do so. So we define the semantics of u.h1.setValid() as setting h1 valid and others invalid.

QinshiWang avatar Dec 13 '21 21:12 QinshiWang

@vgurevich But how about u.h1.setInvalid()? Would the compiler insert the follows?

u.h2.setInvalid();
...
u.hn.setInvalid();

QinshiWang avatar Dec 13 '21 21:12 QinshiWang

I do not think this is really needed to keep the invariant, but it can certainly be done.

vgurevich avatar Dec 13 '21 21:12 vgurevich

I think it would be wrong if h1 was not valid to start with.

mihaibudiu avatar Dec 13 '21 21:12 mihaibudiu

+1 to @mbudiu-vmw

vgurevich avatar Dec 13 '21 21:12 vgurevich

@mbudiu-vmw @vgurevich Are you saying it would be wrong to attempt to perform the statement u.h1.setInvalid() if u.h1 is currently invalid?

If so, why do you say that?

For normal headers, it is perfectly legal and defined behavior to call setInvalid() on a currently invalid header, or to call setValid() on a currently valid header, do you agree? It would be a waste of time to do so if you already knew the header's current validity state when writing such a statement, agreed. But it is still useful in a part of a P4 program that could be reached where the header might be valid, or might not, and you want to guarantee the resulting state as one way or the other.

jafingerhut avatar Dec 13 '21 22:12 jafingerhut

No, it would be incorrect to invalidate all other headers when performing u.h1.setInvalid().

mihaibudiu avatar Dec 13 '21 22:12 mihaibudiu

According to the current spec, that is exactly what is defined to happen, and has since version 1.0.0 in May 2017, I believe. See Section 8.18 "Operations on header unions"

jafingerhut avatar Dec 13 '21 22:12 jafingerhut

Note: If I recall correctly, this was a change made shortly before the 1.0.0 release of the spec in May 2017, and Nate and I made it while reviewing the header union operations. I pointed out that without the currently defined behavior where u.h1.setInvalid() on a single union member making them all invalid, there was no operation defined to make all members of a union invalid EXCEPT by writing u.h1.setInvalid(); u.h2.setInvalid(); u.h3.setInvalid(); ... for all members, and that seemed like we were missing a useful operation, namely some single statement that could guarantee all members were invalid.

Perhaps we could revisit that decision, and introduce a brand new operation, e.g. u.setInvalid() that guaranteed all members became invalid, regardless of which member was valid, even of none were currently valid, and change the behavior of u.h1.setInvalid() to make no change in state if a different member other than h1 is currently valid. At this point that would be a backwards-incompatible change.

jfingerh avatar Dec 16 '21 13:12 jfingerh

Huh. I don't recall this, but May 2017 was a long time ago :-)

I like your proposed solution. @QinshiWang and @MollyDream what do you think?

And I'm not so concerned with backwards compatibility -- I don't think header unions are widely used yet, and I doubt there is code that depends on this feature. Also, there's an easy way to port code written with the old semantics to the new one. So it shouldn't be too disruptive to get this right.

jnfoster avatar Dec 16 '21 13:12 jnfoster

I also like Andy's new proposal better. It is more clear from the operation itself (u.setInvalid()) that it sets all members invalid, while looking at u.h1.setInvalid(), it seems reasonble to infer that it invalidates only the member h1.

MollyDream avatar Dec 16 '21 15:12 MollyDream

To have an operation u.setInvalid() is certainly good, but there's still the problem. What if we do u.h1 = h? If the result depends on the validity of h, will it be a problem?

QinshiWang avatar Dec 16 '21 20:12 QinshiWang

I had assumed u.h1 = h would:

  • Be equivalent to u.h1.setInvalid() if h is invalid.
  • And otherwise it would make u.h1 valid (thereby making any other elements of the union that might be currently valid now invalid) and assign the bits in the header. What's the problem?

jnfoster avatar Dec 16 '21 20:12 jnfoster

I will let @QinshiWang give his answer, but perhaps he is concerned that given the behavior described in the previous comment, u.h1 = h should be a no-op if h is invalid and u.h2 is currently valid (and that might be either expensive to implement, and/or even surprising behavior for a P4 developer).

jafingerhut avatar Dec 16 '21 21:12 jafingerhut

Right...

jnfoster avatar Dec 16 '21 21:12 jnfoster

The problem is u.h1 = h will be expanded into

if(h.isValid()) {
  u.h1.setValid();
  u.h2.setInvalid();
  ...
  u.hn.setInvalid();
  u.h1.values = h.values;
}
else {
  u.h1.setInvalid();
}

The if-statement here looks unhealthy?

QinshiWang avatar Dec 16 '21 21:12 QinshiWang

2022-Mar-07: Some discussion of this issue, including:

  • some history of why the spec has been this perhaps odd state since P4_16 version 1.0 in 2017
  • Dan Talayco mentioned that it seemed strange that there are setValid/setInvalid operations on members of a header union at all, versus validity being a concept that applies only to the header_union as a whole. Nate agreed that this is odd, and stems from header_union members being of type header, which the language spec defines setValid and setInvalid operations on.

Please add more comments if you wish. I know that the above does not capture it completely (and perhaps not even correctly for what it says).

jafingerhut avatar Mar 07 '22 22:03 jafingerhut

2023-Jun-05: While discussing the backwards-incompatible change proposed in this PR: https://github.com/p4lang/p4-spec/issues/995

A question was asked about how many P4 developers actually use header_union. It definitely does not seem to be a commonly-used feature.

I (Andy) asked a P4 developer within Intel working on our IPU products, and she is using header_union in some of her work. She said that she did not know for a fact that it was required for her to get her work done -- more that it seemed to help express some things like at-most-one-of IPv4 or IPv6 header at a particular "layer" of the header sequence.

jafingerhut avatar Jun 05 '23 21:06 jafingerhut

It seems 99% (if not 100%) of users use header_unions in a very conservative way. So maybe we can characterize "good" usage, and regulate ill-formed usage to yield unspecified values. (Targets may guarantee more specific behavior. That's fine.)

QinshiWang avatar Jun 06 '23 04:06 QinshiWang

In the interest of tidying up the set of active issues on the P4 specification repository, I'm marking this as "stalled" and closing it. Of course, we can always re-open it in the future if there is interest in resurrecting it.

jnfoster avatar Nov 11 '23 13:11 jnfoster