lean4
lean4 copied to clipboard
feat: labeled break/continue
This is an implementation of labeled break and continue in do notation.
example : IO Unit := do
for (label := xLoop) x in [1:10] do
for (label := yLoop) y in [1:10] do
for z in [1:10] do
if (x, y, z) = (5, 6, 7) then
break (label := xLoop)
else if (x, y, z) = (7, 8, 9) then
continue (label := yLoop)
It's a relatively straighforward extension to the existing do notation macro, although there is a lot of repeated substructure in the definitions due to the way all the DoResult* types work.
A summary:
-
The syntax of
for,while,repeat,break, andcontinueis extended with an optional(label := l)argument. (This is a somewhat heavyweight syntax, but it's used rarely enough that I don't think it will be an issue.) -
There are three main contexts where do sequences can appear: at the top level (
.regular), inside a for loop (.forIn), and inside a monadic combinator (.nested*).-
In
.regularmode there is nothing to do since no loop labels can be in scope. -
In
.forIn, there used to be two cases: either the loop contains areturnor not. Now, the loop may also contain abreak/continueto an outer loop, which mostly acts the same asreturnfrom the point of view of the inner loop. Because of the same type inference issues that resulted in theDoResulttypes, there are now fourforIncases:.forIn: only break and continue targeting the current loop, noreturn.forInR(was.forInWithReturn): only break and continue targeting the current loop, butreturnis present.forInBC: break and continue may target outer loops, noreturn.forInRBC: break and continue may target outer loops,returnis present
Each case has a different return type:
forIn:m VarsforInR:m (Option ReturnType × Vars)forInBC:m (Option BC × Vars)forInRBC:m (Option (BC ⊕ ReturnType) × Vars)
Here
Option (BC ⊕ ReturnType)represents the type of "places to go after exiting the loop".- The
noneoption means a normal exit, no jump - The
ReturnTypeoption means areturn val, exit to the top level BC.continue (n : Nat)means to continue a loopnlabels above this one. That is, labeled loops are effectively de bruijn indexed by the type. Unlabeled loops are skipped in the numbering, andcontinue 0means to continue the loop just outside the one we just exited.BC.break (n : Nat)similarly means to break from then'th surrounding loop.
Actually,
Option BCis pre-baked into the typeDoStateBC, andOption (BC ⊕ ReturnType)is baked into the typeDoStateRBC ReturnTypeto avoid unnecessary allocations. This does lead to a lot of repetition in the pattern matches though. -
The third category is
.nested, where we have to pack up all the state, including the "pure" value to return to the surrounding context. These are largely unchanged, but theDoResult(S)BC.{break, continue}got aNatfield encoding the loop index to jump to. Note that index0here corresponds to unlabeled break (jump to the immediately surrounding loop), whilen+1corresponds to the outer loops.
-
-
When a labeled break is propagating up the stack, we are mostly only decrementing the loop label index as we exit the loops. But since this is a macro, it has to generate syntax which continues the propagation, and we don't want to explode a match with all
ncases to say "match result with | .break 0 => `(break (label := yLoop)) | .break 1 => `(break (label := xLoop))" and so on for all the loops, since they are mostly just going to be packed back into integers in the next macro expansion. So to make this more efficient, we havebreakIdx% n. This will break to then'th labeled loop, not counting the current one (even if it has a label). Then we only have two cases in the match:.break 0 => breakand.break (i+1) => breakIdx% i. When the current loop is not labeled this simplifies to just.break i => breakIdx% i.
There are still some things left to make this more than an MVP:
- Add more documentation
- Add syntax highlighting / go-to-definition for loop labels
- unused variable linter for loop labels
- Check that it's not a performance hit. (For definitions not using labeled loops, the desugaring is almost unchanged, except that the
DoResultPRBCandDoResultSBCtypes have gotten a little larger, with an extra contstructor argument which is always set to 0 and ignored in the match.)
I suspect this will be a difficult PR to review, so let me know if there is anything I can do to help.