Clarification regarding Exit statement

Open usha1830 opened this issue 1 year ago • 19 comments

Should the pipeline processing continue or not after an exit statement?

control ingress( inout my_ingress_headers_t hdr, inout my_ingress_metadata_t meta,
                          in pna_main_input_metadata_t istd, inout pna_main_output_metadata_t ostd) {

      action send_nh(@tc_type("dev") PortId_t port, @tc_type("macaddr") bit<48> srcMac, @tc_type("macaddr") bit<48> dstMac) { ... } 

      table nh_table { 
          key = { 
             hdr.ipv4.dstAddr : exact @tc_type("ipv4") @name("dstAddr"); 
          } 
          actions = { ... } 
          size = L3_TABLE_SIZE; 
          const default_action = drop; 
      } 
      apply { 
          exit;
          nh_table.apply();
      }
 }

control Ingress_Deparser( packet_out pkt, inout my_ingress_headers_t hdr, 
                                          in my_ingress_metadata_t meta, in pna_main_output_metadata_t ostd) {
      apply { 
          pkt.emit(hdr.ethernet); pkt.emit(hdr.ipv4); 
      } 
} 
PNA_NIC( Ingress_Parser(), ingress(), Ingress_Deparser()) main;

When the program executes the exit statement in the control block's apply section, it stops processing that control block (so nh_table.apply() is not executed). However, pipeline processing continues, so the deparser is still executed.

Is this behavior appropriate?

CC: @vbnogueira @jhsmt @komaljai @sosutha

usha1830 avatar Feb 05 '24 10:02 usha1830

The P4_16 language specification defines exit as terminating the currently enclosing top-level control. What happens after that control exits is up to the definition of the architecture being implemented by the device.

For example, one architecture might be defined to always execute the deparser for all packets after the previous control has finished executing.

A different architecture could be defined such that if extern foo was called during a control's execution, the deparser will not be executed, but if foo was never called, the deparser will be executed.

Note: These are just examples to indicate potential differences between different P4_16 architectures.
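
As a minimal sketch of the language-level behavior only (the names Inner, Outer, meta_t, and stage are hypothetical, and this fragment has no parser or main package), an exit executed inside an action terminates that action, the control that invoked it, and every calling control up to the top-level one:

```p4
#include <core.p4>

struct meta_t {
    bit<8> stage;   // hypothetical marker, only used to label the code paths
}

control Inner(inout meta_t meta) {
    action stop_here() {
        meta.stage = 1;
        exit;            // terminates the action, Inner, and Inner's caller
                         // (a `return` here would only leave the action)
    }
    apply {
        stop_here();
        meta.stage = 2;  // not reached: exit already terminated Inner
    }
}

// Assumed to be the top-level control handed to the architecture.
control Outer(inout meta_t meta) {
    Inner() inner;
    apply {
        inner.apply(meta);
        meta.stage = 3;  // not reached: exit also terminated Outer
    }
}
```

Nothing in this fragment says what the device does once Outer has finished; that is the architecture-defined part.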

jafingerhut avatar Feb 05 '24 13:02 jafingerhut

As far as I can recall in the PSA and PNA architectures, it should technically not matter whether the deparser is executed or not for packets that are requested to be dropped in the ingress/egress/main control, because it can have no side effects (i.e. no effects that are visible to any packet other than the one currently being processed). If the packet is being dropped, whether it is dropped before or after the deparser would thus not be detectable from any external tests that I can think of.

jafingerhut avatar Feb 05 '24 19:02 jafingerhut

We don't want to define a new architecture; rather, the goal for P4TC is to have "functional equivalence" to the hw datapath architecture, in this case PNA. I am not sure what it would mean to run the deparser in s/w after an exit. If the headers were to be overwritten based on some metadata collected before the exit, should we allow that? My view is that, for robustness' sake, an exit should qualify as "stop further processing of this packet and drop it". For P4TC that would return TC_ACT_SHOT.

jhsmt avatar Feb 05 '24 23:02 jhsmt

Note that a P4 exit does not necessarily imply that the packet should be dropped. Whether a packet is forwarded or dropped is determined by other factors defined by the architecture, e.g. for PNA, whether the drop_packet() extern function has been called. You can execute an exit statement without ever executing drop_packet(), and if so, the packet should be sent on without being dropped.

jafingerhut avatar Feb 06 '24 02:02 jafingerhut

Should PNA and PSA call out the behavior in this regard, or is it the spec that needs to be more specific? From the spec: "The exit statement immediately terminates the execution of all the blocks currently executing: the current action (if invoked within an action), the current control, and all its callers. exit statements are not allowed within parsers."

Note no mention of deparser above. The ultimate "caller" of a control block in P4TC is the core driver code (in case of XDP) or tc core that enters the ebpf program. One translation of the above is to "exit" the ebpf program (which happens to include the deparser).

My main concern is the ambiguity. Just because the deparser wasn't mentioned in the spec does not tell a human reader that you should proceed to execute the deparser on exit. And when there is ambiguity in packet processing, I would rather drop the packet and log it somewhere. If I were to interpret the spec dogmatically, exit is equivalent to "goto deparser".

In s/w, exit() means you stop all processing. There may be a registered callback (analogous to a default action) that gets invoked to process the exit result (see man exit), but what is explicit is that you don't process the next component. I understand that in hardware this may be tricky to do (you can't just skip the whole pipeline).

jhsmt avatar Feb 06 '24 12:02 jhsmt

> There may be a registered callback (analogous to a default action) that gets invoked to process the exit result (see man exit)

This is what happens in P4 too. When exit is invoked, all processing in the current P4 block is halted and control passes back to the architecture. So the "registered callback" is the architecture, which can truly do whatever it wants including: proceeding to the next block, re-executing the block, dropping the current packet, bricking the device, and so on.

So in my view, this really isn't a question about the semantics of the P4 Language, which is perfectly clear. It's a question about what specific architectures like PSA or PNA may choose to do when exit is invoked from one of its parser or control blocks. Note however that if the architecture wants to do something special when exit is invoked, then it will need some information to distinguish normal executions from executions that terminate due to exit. And because P4 lacks exceptions or other return signals, it's not clear to me where that information would come from... Fixing that could be a reason to revisit the language semantics.

jnfoster avatar Feb 06 '24 12:02 jnfoster

> This is what happens in P4 too. When exit is invoked, all processing in the current P4 block is halted and control passes back to the architecture. So the "registered callback" is the architecture, which can truly do whatever it wants including: proceeding to the next block, re-executing the block, dropping the current packet, bricking the device, and so on.

Thanks for the clarification. Makes sense for PNA to call it out then. The compiler could be taught to add gotos in the generated code (or even separate each block, parser, deparser into its own ebpf program with an "architecture" dispatch loop making the decisions on how to proceed based on our TC_ACT_XXX return codes). I still feel drop would be the most sane default action for exit and perhaps we need an atexit() definition, analogous to default miss action for example, that may be used to override from within a P4 program what the architecture defaults to.

> So in my view, this really isn't a question about the semantics of the P4 Language, which is perfectly clear. It's a question about what specific architectures like PSA or PNA may choose to do when exit is invoked from one of its parser or control blocks. Note however that if the architecture wants to do something special when exit is invoked, then it will need some information to distinguish normal executions from executions that terminate due to exit. And because P4 lacks exceptions or other return signals, it's not clear to me where that information would come from... Fixing that could be a reason to revisit the language semantics.

I am wondering whether the original thought was to emulate the s/w equivalent of exit(), or whether this was driven by "this is the only way we could make it work because hardware works this way".

jhsmt avatar Feb 06 '24 13:02 jhsmt

If you think of a top level control as a process in software, then P4's exit seems pretty equivalent to me to exit() in Unix/Linux.

If you think of the processing of an entire packet, including whatever happens to it after it finishes executing a P4 control, as the thing to correspond with a Unix/Linux process, then they are definitely different.

I am not aware of any hardware-specific focus of P4's exit statement here. It is a control flow construct that enables P4 developers to immediately terminate all execution within the top level P4 control that is currently executing.

jafingerhut avatar Feb 06 '24 18:02 jafingerhut

> If you think of a top level control as a process in software, then P4's exit seems pretty equivalent to me to exit() in Unix/Linux.
>
> If you think of the processing of an entire packet, including whatever happens to it after it finishes executing a P4 control, as the thing to correspond with a Unix/Linux process, then they are definitely different.

Then the use of the word "exit" is probably misplaced, since it carries a very specific meaning in the Linux world? To build on your analogy: unix process == packet pipeline, meaning exit() translates loosely to stopping the pipeline processing. The software signal handler for exit() then equates to the architecture's definition of what should happen when exit is encountered, which I suggested we should be able to override from a P4 program.

> I am not aware of any hardware-specific focus of P4's exit statement here. It is a control flow construct that enables P4 developers to immediately terminate all execution within the top level P4 control that is currently executing

Absent PNA currently saying whether or not we should go to the deparser, shall we say the default behavior should be to drop (for P4TC s/w it makes sense, to avoid unpredictable behavior)?

jhsmt avatar Feb 06 '24 21:02 jhsmt

I believe PNA today does say that if you call drop_packet(), the packet should be dropped, and if you do not call drop_packet(), it should not be dropped.

That is completely independent of whether you use a P4 exit statement. I am happy to ask other PNA implementers whether that is their understanding as well; it seems best to me to follow that approach.

jafingerhut avatar Feb 06 '24 22:02 jafingerhut

> I believe PNA today does say that if you call drop_packet(), the packet should be dropped, and if you do not call drop_packet(), it should not be dropped.

And if I am not mistaken, PSA has a built-in default of dropping packets (which has nothing to do with exit either).

> That is completely independent of whether you use a P4 exit statement, and I am happy to ask other PNA implementers whether that is their understanding as well, it seems best to me to follow that approach.

In my opinion, having exit mean "goto deparser" leads to unpredictable behavior (see earlier example). For P4TC I think we'll stick to dropping the packets unless there is a good reason not to.

jhsmt avatar Feb 07 '24 13:02 jhsmt

If required, the compiler can insert a drop_packet() call when it processes an exit statement, should we want to impose this behavior for the TC backend.

usha1830 avatar Feb 07 '24 13:02 usha1830

@jhsmt Suppose in a few months, a P4TC P4 developer comes to you and says:

"Hey, I have a fancy main control I am trying to write in P4TC. I get to this point X in my code, and I have set things up so everything is exactly like I want it to be. I don't want to update any more counters, or any more register externs. I don't want to run any of the rest of the code in my main control, which might do those things. But I want to send the packet out of the device, just like it is now. How can I do that?"

If you implement exit the way the language spec says, the answer is "Write an exit statement at point X in your program."

If you implement exit the way I think you are planning on, the answer is "Change the logic of the rest of your program after point X so that it does not update any state, nor any packet headers, e.g. perhaps with some new boolean flag you assign true at point X, and use in N new if conditions after point X to disable modifications that you do not want to happen."

What I would recommend is: Let the P4 developer use exit to mean what it means now. If they want to drop the packet, then they just put a call to drop_packet() just before that exit statement, and it will drop the packet. If they do not want to drop the packet, then they won't execute drop_packet() in the code path leading up to that exit statement.
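
A minimal sketch of that recommendation, assuming the drop_packet(), send_to_port(), and Register externs from pna.p4 behave as discussed in this thread. The header and metadata types, the done and unwanted flags, and the pkt_seen register are hypothetical, and the fragment omits the parser, deparser, and main package:

```p4
#include <core.p4>
#include <pna.p4>

header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

struct headers_t {
    ethernet_t ethernet;
}

struct metadata_t {
    bool done;      // hypothetical: packet is already exactly as wanted ("point X")
    bool unwanted;  // hypothetical: packet should be discarded
}

control MainControlImpl(
    inout headers_t hdr,
    inout metadata_t meta,
    in    pna_main_input_metadata_t  istd,
    inout pna_main_output_metadata_t ostd)
{
    // Hypothetical state that the code after point X would update.
    Register<bit<32>, bit<32>>(1) pkt_seen;

    apply {
        if (meta.unwanted) {
            drop_packet();  // the drop decision is made explicitly here
            exit;           // exit only skips the rest of this control
        }
        if (meta.done) {
            send_to_port(istd.input_port);  // forwarding decision already made
            exit;           // point X: skip the state and header updates below
        }
        // Reached only when neither exit above was executed.
        pkt_seen.write(0, 1);
        hdr.ethernet.srcAddr = hdr.ethernet.dstAddr;
        send_to_port(istd.input_port);
    }
}
```

In this sketch the drop-or-forward decision comes only from the extern calls; exit just ends the control early, so no extra boolean guards are needed around the code after point X.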

jafingerhut avatar Feb 07 '24 14:02 jafingerhut

@jafingerhut, you make good points; there is potential that someone would want to say "change the headers but if you see foo then yeah, skip those extra things and jump to the deparser and send the packet out with those changed headers". I just can't think of a good reason for that, but it would be unreasonable of me to say that it is an invalid case. Whoever decided that "exit means goto deparser" must have had a good reason. At a minimum, I feel it needs to be called out in PNA. And it should be possible to override, as with the atexit() syntax.

jhsmt avatar Feb 07 '24 19:02 jhsmt

To be clear, no one decided that "exit means go to deparse".

It means "immediately exit the current top level control that is being executed".

What happens after that is up to the architecture.

jafingerhut avatar Feb 07 '24 19:02 jafingerhut

> To be clear, no one decided that "exit means go to deparse".
>
> It means "immediately exit the current top level control that is being executed".
>
> What happens after that is up to the architecture.

And in PNA it is just jumping to the deparser... do we need to be more explicit?

jhsmt avatar Feb 07 '24 20:02 jhsmt

Agreed, and in the PNA architecture it is defined that the control flow is "execute MainParser, then execute MainControl, then execute the MainDeparser". (That is not a direct quote from the PNA specification, but I believe it is implied by its contents.)

This is not that different from PSA defining that the ingress control flow is "execute IngressParser, then execute IngressControl, then execute IngressDeparser". THEN do whatever the PSA specification says about what happens to packets after the ingress deparser, but before the traffic manager (because packets might be dropped before ever getting to the traffic manager at that point). The traffic manager might clone and/or multicast or unicast packets. Then it says to do "EgressParser, then EgressControl, then EgressDeparser".

Sure, we could say in the PNA specification: "By the way, the control flow is MainControl, then MainDeparser, even if the MainControl causes an exit statement to be executed."

I know you aren't asking for this, but we could also list a dozen other things besides an exit statement, e.g. assignment statements, table apply calls, return statements, extern method calls, etc., and that statement would still be true. I understand that exit might cause more confusion in this area for some people than all of those other statements, but I wanted to point out that if one has a proper mental model of how exit is defined in P4, then exit is "just another thing that can be done in any control".

jafingerhut avatar Feb 07 '24 20:02 jafingerhut

Ok, let me make my last comment and hopefully we can close this.

In my view, the bottom line is whether exit should imply a) stop processing the control block (as specified in the lang spec) or b) stop processing the pipeline. @jafingerhut and @jnfoster both made the point for (a): that it is up to whoever wrote the program to understand the specified semantics and to use exit to stop the control block processing, and that architectures can extend the behavior if deemed necessary. I made the claim that the currently defined behavior is impractical, at least for P4TC, but concede that it may be the intent. IMO, we need a choice of one or the other in PNA, which can default to the current spec.

Reiterating: there is no distinction between the different pieces (control, deparser) in P4TC; it is the same piece of generated code. I would expect most s/w implementations to be along the same lines. Potential solutions (which we could keep P4TC-specific):

  1. Maintain spec behavior: exit from the control block could be compiled to a goto in the generated code that jumps to the deparser code. This matches the current spec.
  2. When encountering exit, we drop the packet (and undo anything destructive), which is what I was pitching earlier. Needs to be documented as such.
  3. Make it a compile-time choice to pick between 1 and 2, which is a much better approach. Needs to be documented as such.

jhsmt avatar Feb 08 '24 19:02 jhsmt

After some more internal discussions, I would like to take back everything I said. Let's stick to what the spec says. We need to do more testing of both exit and return and verify the correct code is being generated.

jhsmt avatar Feb 13 '24 15:02 jhsmt