
This is the final version of preemptible interrupt handler of section 7.2 (no issue as such)

Open David-Horner opened this issue 5 years ago • 6 comments

This is the final version of preemptible interrupt handler of section 7.2.

It is made possible by allowing mnxti to handle pending interrupts at the same level as the current one. This code assumes that functionality is enabled by using "csrrsi a1,mnxti,1".

The greatest benefit of this functionality is the reduced time to handle very high interrupt loads. Specifically, the minimum handling path is 12 instructions, with no stack reads or writes and no interrupt trap.

I hope the comments are comprehensive and self explanatory.

   .p2align 3
      # Inline preemptible interrupt handler.
      # Only safe for horizontal interrupts.
      #
      # In this version B 2.0:
      #    Selectively hardware vectored interrupts will potentially be substantially
      #       delayed as no interrupts are enabled until the epilogue.
      #    The worst case delay is from an intrpt2 section bnez to intrpt1
      #       through to interrupts enabled in the epilogue. 24 instructions. 
      #
      # Note: This code uses an extension instruction:
      #          Branch Bit Set (bbs) proposed for the Fast Interrupt Extension
      #       Also, a variant of the mnxti csr that allows same level pil selection,
      #                also proposed for Fast Interrupt  
      #
      # This version does not handle the type 1 interrupt processing through an 
      #    interrupt, but through a mnxti branch sequence. As a result the INT1
      #    value is not enabled in the sp, but rather INC2 is always used to indicate
      #    second (and subsequent) time through the primary entry point.  
      #
      #
      # Non preempted code path is 28 instructions including:
      #       5 stores, 4 loads, one amo, mret and 6 csr: 
      #       2 csr read, 2 csr write,
      #       1 csr modify (MIE on) and 1 special (a1,mnxti,1).
      # 
      # Minimum handle code is 12 instructions, with no stack lw/sw nor hardware trap.
      #   This is the code path from intrpt1 through pending check back to intrpt1.
      # It is during highest interrupt frequency that this code is in play, 
      #    which is a very favourable characteristic.
      # It is also the worst latency case as 12 instructions execute between checks. 
      #
      # The total count for one pass through each of the 3 paths is
      #  28 + /*first entry to epilogue mret (worst case, best is 19)*/
      #   3 + /*2nd type preemption (no pending) */
      #  13 + /*1st type interrupt (pending int flag clear & accum)*/
      #  19   /* continue to complete current int flag clear and accum and return*/
      #  Total 63 instructions, for an average of 21.
      #
      # Note: With prerequisites met, a single copy of the code handles the low
      #         2K vectored interrupts; All 4K can be handled by replacing the
      #         andi with a slli;srli pair (which lengthens the code path by one instruction).
      #         Lower values in the andi are possible to handle 1K, 512, etc. counters.
      #
      #
      # Prerequisites
      # 0. sp is used in non-standard ways in this code but no restrictions on where
      #        sp points in memory.
      #    Minimum alignment is 4 bytes but 8 or more optimizes compressed load/stores.
      # 1. Normal sp management preserving alignment must be performed by other code.
      # 2. It is assumed that routines that use this handler are at the highest 
      #      interrupt priority. Thus this code will only be preempted by this code.
      #    If this code can be preempted by selectively hardware vectored interrupts,
      #      their handler must also accommodate the INT2 flag in the sp. 
      #      (That is tolerate 1/2 the normal sp alignment).        
      # 3. Interrupt flag and counter offsets from their base
      #      are both scaled from exccode in mcause
      # 4. The base for INTERRUPT_FLAGS and COUNTERS should be 4K aligned
      #      this saves 2 instructions using lui to provide base and offsets.
      # 5. This code assumes all 2K/4K interrupts are COUNTER accumulators.
      #      Any interrupts can be used for other purposes but there will be 
      #      corresponding holes in the COUNTER array.
      #              
      #
   .equ  STCK_ALIGN,16             # in bytes (minimum of 4 is required)
                                # with 8 byte aligned stack lw & sw are compressed
   .equ  INC2,STCK_ALIGN/2         # 
   .set  COUNTERS,COUNTER_ARRAY    #
   .set  INTRPT_FLAGS,INTERRUPT_FLAGS_ARRAY
                                   #
   foo:
      #----- Interrupts disabled on entry ---#
      bbs sp, INC2, intrpt2        # do subsequent time through processing
      #----- First time in interrupt context ----#
      addi sp, sp, -FRAMESIZE+INC2 # advance sp and flag for intrpt2
      sw a0, OFFSETa0-INC2(sp)     # Save working register.
      sw a1, OFFSETa1-INC2(sp)     # Save working register.
      csrr a0, mepc                # Read epc.
      sw a0, OFFSETepc-INC2(sp)    # save epc
      csrr a0, mcause              # Read cause.
      sw a0, OFFSETcause-INC2(sp)  # save mcause
      # if interrupt pending: to intrpt1 and try to catch up
      csrrsi a1,mnxti,1            # new variant to allow equal pil pending interrupt
      bnez a1, intrpt1             # process pending interrupt (a1 != 0).
   advance:
      # a0 = current mcause
      andi a0,a0,0x7FF             # mask off low 2K entries of exccode
      la a1, INTRPT_FLAGS
      add a1, a0, a1
      sw x0, (a1)                  # Clear current interrupt flag.
      la a1, COUNTERS              # 
      slli a0,a0,2                 # scale to words for counters
      add a1, a1, a0
      li a0,1
      amoadd.w x0, a0, (a1)        # increment current counter in memory.
      # ---- we've caught up (or had no interrupts) #
      csrrsi x0, mstatus, MIE      # Enable interrupts.
      #----- Interrupts enabled ---------# 
      #----- No other critical section until interrupt or mret ----#
      lw a1, OFFSETa1-INC2(sp)     # Restore a1.
      lw a0, OFFSETepc-INC2(sp)    # Restore original epc
      csrw mepc, a0                # Put epc back.
      lw a0, OFFSETcause-INC2(sp)  # Restore original cause
      csrw mcause, a0              # Put cause back.
      lw a0, OFFSETa0-INC2(sp)     # Restore a0.
      addi sp, sp, FRAMESIZE-INC2  # Restore sp
      mret                         # Return from handler
      #
      #  following code executes as a result of preemption
      #
   intrpt1:
      #----- Interrupts disabled  ---------#
      # a0 = prev mcause
      andi a0,a0,0x7FF             # mask off low 2K entries of exccode
      # if all 4K entries are desired replace andi here and above with
      #slli a0,a0,20                # mask off all 4K entries of exccode
      #srli a0,a0,20
      la a1, INTRPT_FLAGS
      add a1, a0, a1
      sw x0, (a1)                  # Clear previous interrupt flag.
      la a1, COUNTERS              # 
      slli a0,a0,2                 # scale to words for counters
      add a1, a1, a0
      li a0,1
      amoadd.w x0, a0, (a1)        # increment previous counter in memory.
   intrpt2:
      csrr a0, mcause              # Read cause.
      csrrsi a1,mnxti,1            # new variant to allow equal pil pending interrupt
      bnez a1, intrpt1             # process pending interrupt (a1 != 0).
      j advance                    # nothing pending: finish the current interrupt

      #------------------------------------#
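As a rough illustration of how the handler turns `mcause` into table offsets, the following Python sketch mirrors the `andi` / `add` / `slli` steps in the listing. The function name `handler_offsets` and the mask constants are hypothetical, and the 4K variant assumes a 32-bit register (RV32), where `slli`/`srli` by 20 keeps the low 12 bits.

```python
# Hypothetical model of the offset arithmetic in the listing above.
EXCCODE_MASK_2K = 0x7FF  # andi a0,a0,0x7FF keeps the low 11 bits (2048 entries)
EXCCODE_MASK_4K = 0xFFF  # slli/srli by 20 on RV32 keeps the low 12 bits (4096 entries)


def handler_offsets(mcause, mask=EXCCODE_MASK_2K):
    """Return (flag_offset, counter_offset) in bytes for an mcause value."""
    exccode = mcause & mask
    flag_offset = exccode          # add a1, a0, a1: offset is exccode itself
    counter_offset = exccode << 2  # slli a0,a0,2: one 32-bit word per counter
    return flag_offset, counter_offset


if __name__ == "__main__":
    # Interrupt bit set, exccode 7.
    print(handler_offsets(0x80000007))
    # Same code path with the 4K mask.
    print(handler_offsets(0x80000FFF, EXCCODE_MASK_4K))
```

With the 2K mask, exccode values above 0x7FF alias onto lower entries, which is why the listing suggests the `slli`/`srli` pair when all 4K entries are needed.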

David-Horner avatar Sep 14 '20 19:09 David-Horner

Without the new variant for mnxti to allow equal pil pending interrupt, pil in mcause needs to be reduced by 1. Fortunately mcause is in a0 when csrrsi a1,mnxti,1 would be executed, so we can change the new instruction into

      lui a1,PILbit0               # low bit location of mpil in mcause
      sub a0,a0,a1                 # reduce mpil in mcause by 1
      csrrw x0, mcause, a0         # write it back to activate it
      csrrci a1,mnxti, MIE         # using reduced pil continue with interrupts disabled

This is costly. The write to mcause especially needs to complete before mnxti executes. In any case, 3 more instructions including the mcause write back.
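The `lui`/`sub` adjustment can be modelled in a few lines of Python. This is only a sketch: the field position (`MPIL_SHIFT = 16`, i.e. mpil in bits 23:16 of mcause) is an assumption taken from the CLIC draft layout, and the helper names are hypothetical.

```python
# Hypothetical model of "lui a1,PILbit0 ; sub a0,a0,a1" on a 32-bit register.
MPIL_SHIFT = 16               # assumption: mcause.mpil occupies bits 23:16
MPIL_MASK = 0xFF << MPIL_SHIFT


def mpil(mcause):
    """Extract the previous interrupt level field from mcause."""
    return (mcause & MPIL_MASK) >> MPIL_SHIFT


def reduce_mpil(mcause):
    """Subtract one from the mpil field by subtracting 1 << MPIL_SHIFT."""
    return (mcause - (1 << MPIL_SHIFT)) & 0xFFFFFFFF


if __name__ == "__main__":
    before = 0x80030007
    after = reduce_mpil(before)
    print(hex(before), mpil(before), "->", hex(after), mpil(after))
```

Note that a plain subtraction borrows from the higher mcause fields when mpil is already 0, so the sequence is only safe when the handler runs with mpil of at least 1.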

David-Horner avatar Sep 22 '20 23:09 David-Horner

Hi David, I'm a little confused why you want to allow equal pil pending interrupt

here is an example:
interrupts 1,2, programmed to all be at interrupt level 1
interrupts 11 programmed to be at interrupt level 10

My understanding of behavior as currently defined:
running code
-> interrupt 2 occurs
pil set to 0, start processing interrupt 2
-> interrupt 11 occurs
pil set to 1, preempt interrupt 2, start processing interrupt 11
-> interrupt 1 occurs
pil still set to 1, keep processing 11
interrupt 11 finishes, use mnxti to see if any interrupts are pending, none are above level 1 so return to interrupt 2
finish processing interrupt 2, use mnxti to see that interrupt 1 is waiting
process interrupt 1
finish processing interrupt 1
return to code

Proposed behavior if allow equal pil pending interrupt:
running code
-> interrupt 2 occurs
pil set to 0, start processing interrupt 2
-> interrupt 11 occurs
pil set to 1, preempt interrupt 2, start processing interrupt 11
-> interrupt 1 occurs
pil still set to 1, keep processing 11
interrupt 11 finishes, use mnxti to see if any interrupts are pending, see interrupt 1 level == pil
process interrupt 1, finish processing interrupt 2, return to code

if interrupt 2 and interrupt 1 are programmed at the same level, why do you want interrupt 1 to be able to complete before interrupt 2? program interrupt 1 to a higher level if you want that behavior. since they are programmed at the same level, interrupt 2 should be able to finish before tail-chaining to interrupt 1, right?

Thanks, Dan
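The difference between the two behaviors in the example above comes down to whether the level comparison is strict. The Python sketch below is a simplified model, not the specified mnxti semantics: it ignores mintthresh, enables, and priority bits, breaks ties by lowest interrupt number, and the name `mnxti_candidate` is hypothetical.

```python
# Simplified, hypothetical model of which pending interrupt mnxti reports.
def mnxti_candidate(pending, levels, mpil, allow_equal=False):
    """Return the pending interrupt id that qualifies, or None.

    pending     -- set of pending interrupt ids
    levels      -- dict mapping interrupt id to its programmed level
    mpil        -- level held in mcause.mpil when mnxti is read
    allow_equal -- model the proposed variant that accepts level == mpil
    """
    best = None
    for irq in sorted(pending):
        level = levels[irq]
        qualifies = level >= mpil if allow_equal else level > mpil
        if qualifies and (best is None or level > levels[best]):
            best = irq
    return best


if __name__ == "__main__":
    levels = {1: 1, 2: 1, 11: 10}
    # Interrupt 11 finishing, interrupt 1 pending, pil == 1.
    print(mnxti_candidate({1}, levels, mpil=1))
    print(mnxti_candidate({1}, levels, mpil=1, allow_equal=True))
```

Under the strict comparison interrupt 1 is only picked up once pil has dropped back to 0, that is, after returning to interrupt 2.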



dansmathers avatar Sep 23 '20 15:09 dansmathers

On 2020-09-23 11:03 a.m., dansmathers wrote:

Thank you very much for your response.

Sorry for the delay in responding.

> Hi David, I'm a little confused why you want to allow equal pil pending interrupt
>
> here is an example:
> interrupts 1,2, programmed to all be at interrupt level 1
> interrupts 11 programmed to be at interrupt level 10

This is not the situation I have in my code, but we can certainly discuss it.

Even in this scenario, one might want to check for and respond to a pending interrupt of equal pil.

> My understanding of behavior as currently defined:
> running code
> -> interrupt 2 occurs
> pil set to 0, start processing interrupt 2
> -> interrupt 11 occurs
> pil set to 1, preempt interrupt 2, start processing interrupt 11
> -> interrupt 1 occurs
> pil still set to 1, keep processing 11
> interrupt 11 finishes, use mnxti to see if any interrupts are pending, none are above level 1 so return to interrupt 2
> finish processing interrupt 2, use mnxti to see that interrupt 1 is waiting
> process interrupt 1
> finish processing interrupt 1
> return to code

Looks good to me. A typical scenario.

> Proposed behavior if allow equal pil pending interrupt:
> running code
> -> interrupt 2 occurs
> pil set to 0, start processing interrupt 2
> -> interrupt 11 occurs
> pil set to 1, preempt interrupt 2

but only when global interrupts are disabled, otherwise 11 waits.

This allows code to establish a safe return point that does not require mret, and more importantly extensive register restore and mret setup.

> , start processing interrupt 11
> -> interrupt 1 occurs
> pil still set to 1, keep processing 11
> interrupt 11 finishes,

But what does interrupt 11 do when it finishes?

In the case that cares about mnxti, it branches to the safe return point in interrupt 2 code.

Now, pil = level 1

the safe return code can complete interrupt 2 or it can check with the special mnxti and handle the interrupt 1 first.

It has substantial flexibility.

But even if it completes 2 first it still wants to detect interrupt 1 without restoring mcause and doing a mret.

Indeed, without restoring mcause at all, but only doing a special mnxti to identify and process any pending same-level interrupt from there on.

(continuing processing like alternating between 1 and 2)

It is a "minor" optimization; mcause could instead be required to be restored, but that would require a stack read in my code, which is no longer a minor operation.

(not to mention the lockout/delay from a write to a CSR to its use in an "instruction" like mnxti)

To be explicit, the following is NOT the processing order I envisioned (but it is certainly possible):

> use mnxti to see if any interrupts are pending, see interrupt 1 level == pil
> process interrupt 1, finish processing interrupt 2,
> return to code

Yes, eventually we get back to the original code, but we make every attempt to catch as many pending or coincident interrupts as possible.

Note, my code has interrupts enabled throughout the epilogue.

It is especially here that a same level interrupt can occur and set pil to the current level.

It is this scenario I envisioned. Thus, I stipulated in my code preface that these horizontal interrupts are configured at the highest interrupt level in use by the system.

As we see here, that is not absolutely necessary, but any lower priority level for the code would require the higher level interrupt handler to be "cooperative" with the code in the lower level for full benefits to be realized.

> if interrupt 2 and interrupt 1 are programmed at the same level, why do you want interrupt 1 to be able to complete before interrupt 2?

Good question. It was not what I needed. But software could make that determination,

      perhaps the hardware was set incorrectly at equal (an errata or intentional) and it has to compensate, or

      perhaps the relative priorities change dynamically and software can thus accommodate without reprogramming the CLIC (FLICR? [Fast Local Interrupt CLIC Renamed])

Anyway, it could definitely come in handy.

> program interrupt 1 to a higher level if you want that behavior.

This is high overhead for dynamically changing priorities, perhaps insufficiently timely.

> since they are programmed at the same level, interrupt 2 should be able to finish before tail-chaining to interrupt 1, right?

Correct.

But even if they were different levels the completion order does not have to be the invocation order:

We can see that with interrupt 11: if it occurred first, mnxti would show a pending interrupt 1 or 2, and software could process it before finishing 11 if it so chose.

> Thanks, Dan

Thank you!!

Directed questions greatly help me to flesh/flush out these concepts in/from my head.



David-Horner avatar Sep 25 '20 14:09 David-Horner

Hi David,
Is your preemptible interrupt handler a special case that you would like included in the spec along with the existing handler? If I'm understanding it right, in your handler, interrupts are only enabled for preemption during the final restore from the stack (epilogue). The rest of the time interrupts are disabled, so there won't be preemption of an interrupt while it is doing work (in the example, the interrupt task is just incrementing a counter). So this is designed for short, high priority tasks where the amount of work the high priority interrupts have to accomplish is less than the length of the epilogue? If I'm understanding it right, the horizontal preemption that exists is just used to shorten the epilogue (where all the csr/stack accesses/mret are) and there wouldn't be preemption of the interrupt tasks themselves, right?

maybe this should be added to the beginning of the handler comments? e.g. Inline interrupt handler for rapid tail-chaining of critical high priority interrupts.

dansmathers avatar Oct 06 '20 22:10 dansmathers

@dansmathers

> Is your preemptible interrupt handler a special case that you would like included in the spec along with the existing handler?

Yes, if it doesn't supersede that example. Each has its own strengths and constraints, so alongside is probably best. It is only a special case in that it takes to the extreme what preemptible means. All handlers will have critical sections. By constraining where horizontal interrupts are allowed to preempt, optimal handling of the high frequency interrupts is possible.

> If I'm understanding it right, in your handler, interrupts are only enabled for preemption during the final restore from the stack (epilogue).

Actually, no. There is an opportunity to accept another pending interrupt on either side of the core service function (the counter increment and interrupt flag reset), not just in the epilogue. If we define preemption in terms of nesting, then this code fails that definition. There is no recursive handling. Instead, the requests are queued and processed in order. The depth in this example is two.

Other possibilities with the same idea of bracketing the service code could allow out-of-order "preemptive servicing", especially if the service code were partitioned. However, that is more complex, generates more code, increases the stack footprint (due to more registers required for state) and may increase latency. So it is not as good an example, in my thinking. More importantly, keeping order avoids starvation of equal and lower priority interrupts. Reducing service code overhead compensates for not expediting higher priority handling and also avoids potential dropouts.

> the rest of the time interrupts are disabled so there won't be preemption of an interrupt while it is doing work (in the example, the interrupt task is just incrementing a counter). So this is designed for short, high priority tasks where the amount of work the high priority interrupts have to accomplish is

up to here yes ...

> less than the length of the epilogue?

No, that is not the defining characteristic. Because the epilogue is fully interruptible and rerunnable, its length/duration is not a determinant. This is one of the novel aspects; a generally applicable but not primary objective.

The primary goal is to keep all critical sections (and especially housekeeping tasks) as small as possible to maximize handling bursts of interrupts. A more complex service portion can be subdivided into multiple critical sections, with a corresponding increase in registers needed to contain intermediate state.

The biggest win is that the service portion can become the dominant part of the inner process. In this case the code from intrpt1 up to and including intrpt2 is all service code. In the most contested case, only two more instructions are needed to manage the hart handler housekeeping: the new mnxti variant and the branch taken if a new interrupt is detected. [Yes, the service portion does reset the interrupt controller, but the hart handler housekeeping, processing/logic state, register save/restore, mret setup, etc. are moved out of that path as much as possible.]
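To put rough numbers on the contested case, the sketch below adds up instruction counts for a burst that is caught entirely by the in-handler pending check. The 28 and 12 figures come from the handler comments; the split into 10, 1 and 18 is obtained by counting instructions in the listing and should be treated as an assumption, as should the function name `burst_cost`. Hardware trap latency and epilogue preemption are ignored.

```python
# Hypothetical instruction-count model for a burst of same-level interrupts.
FIRST_ENTRY_PATH = 28   # non-preempted path, from the handler comments
PROLOGUE_TO_BNEZ = 10   # bbs through the first bnez (counted from the listing)
CATCH_UP_LOOP = 12      # intrpt1 through the pending check in intrpt2
JUMP_TO_ADVANCE = 1     # jump from intrpt2 back to advance
ADVANCE_TO_MRET = 18    # advance through mret (counted from the listing)


def burst_cost(extra_pending):
    """Instructions executed when extra_pending further interrupts are
    already pending at each check and are absorbed by the catch-up loop."""
    if extra_pending == 0:
        return FIRST_ENTRY_PATH
    return (PROLOGUE_TO_BNEZ
            + CATCH_UP_LOOP * extra_pending
            + JUMP_TO_ADVANCE
            + ADVANCE_TO_MRET)


if __name__ == "__main__":
    for extra in (0, 1, 4):
        total = burst_cost(extra)
        separate = FIRST_ENTRY_PATH * (extra + 1)
        print(extra + 1, "interrupts:", total, "vs", separate, "if taken separately")
```

Each additional interrupt absorbed by the loop costs 12 instructions in this model, against 28 for a full pass through the handler.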

> If I'm understanding it right, the horizontal preemption that exists is just used to shorten the epilogue (where all the csr/stack accesses/mret are) and there wouldn't be preemption of the interrupt tasks themselves, right?

As before, it depends on whether nesting is central to your idea of preemption. The epilogue is indeed preempted in that case, but we don't see the nesting because the rerunnable code allows that nesting to be virtual, absorbed in "tail processing". As also mentioned before, there are two other points, before and after the service code, at which pending interrupts are accepted.

> maybe this should be added to the beginning of the handler comments? e.g. Inline interrupt handler for rapid tail-chaining of critical high priority interrupts.

That is a good summary and I'm happy to include it. However, I believe the opening comments detail what I consider the salient and distinctive features:

  1. Reduced latency, with critical sections comparable to the original non-preempt code.
  2. Vastly improved service in the heavily contended case.

David-Horner avatar Oct 07 '20 02:10 David-Horner

Just linking this, code assumes #101

kasanovic avatar Jun 22 '21 16:06 kasanovic