unpythonic
unpythonic copied to clipboard
Pythonify TAGBODY/GO from Common Lisp
See Peter Seibel: Practical Common Lisp, Chapter 20 for an explanation.
Rough draft of how we could pythonify this. User interface:
from unpythonic.syntax import macros, with_tags, tag, go
@with_tags # <-- decorator macro
def myfunc(): # <-- just a regular function definition
x = 42
tag["foo"] # tag[...] forms must appear at the top level of myfunc
x += 1
if x < 10:
go["foo"] # go[...] forms may appear anywhere lexically inside myfunc
return "whatever"
In a with_tags section, a tag[...] form at the top level of the function definition creates a label. The go[...] form jumps to the given label, and may appear anywhere lexically inside the with_tags section. To stay within Python semantics (following the principle of least surprise), myfunc otherwise behaves like a regular function.
Possible macro output:
# actually no explicit import; just use `hq[]` in the macro implementation.
from unpythonic import trampolined, jump
def myfunc(): # may take args and kwargs; we pass them by closure
@trampolined
def body(): # gensym'd function name
nonlocal x
x = 42
return jump(foo)
def foo(): # use the tag name as the function name
nonlocal x
x += 1
if x < 10:
return jump(foo)
return "whatever"
return body()
x = None # never reached; just for scoping
Essentially, this is another instance of lambda, the ultimate goto.
Notes:
- Good performance, since no need to use exceptions for control.
- Only
body(the entry point, called by the expandedmyfunc) needs to be trampolined, since none of the inner functions are accessible from the outside (their names being local tomyfunc). - Use scope analysis (see
unpythonic.syntax.scoping) to determine which variable names are local tomyfuncin the input. Then scope those tomyfunc(by assigning aNoneat the end), and declare themnonlocalin the helper functions, so that the top level ofmyfuncforms just one scope (emulating Python's scoping rules), though the macro splits it into helper functions.- Beware any names in an existing inner
defor comprehension form - those should stay local to that innerdef. Only top-level locals matter here. The scoping utility should already take this into account (stopping name collection at scope boundaries).
- Beware any names in an existing inner
- A top-level
returnfrom one of the helper functions will shut down the trampoline, and return that value from the TCO chain. This results in a top-level return frommyfunc, which is exactly what we want. So we don't have to transform top-levelreturnstatements when we generate the expansion. - Tags should be collected in an initial pass over the body.
- Tag names must be unique within the same
with_tagssection (enforce this in the syntax transformer!), so we can use the tag names as the names of the helper functions. This is also informative in stack traces. - Make
tag[]andgo[]raise an error at runtime if used outside anywith_tags. Usemacro_stub.
Things to consider:
- How to support lexical nesting of
with_tagssections? Jumping to a tag in a lexically outerwith_tagsis the complex part. Possible solution below in a separate comment.- The current draft doesn't work for that, since what we want is to shut down the inner trampoline, and give the
jump(...)to the outer trampoline. - Wrap the inside of
myfuncin an exception handler, and makego[]to an undefined label raise an exception with instructions where to jump?- An unresolved
go[]can be allowed to unwind the call stack, because nesting is lexical - when unresolved locally (i.e. by the nearest enclosingwith_tags), ago[]can only refer to tags that appear further out. - The handler can check if this
with_tagssection has that label, jump to it if it does, and else re-raise. The exception can have anargsmessage complaining aboutgo[]to a label not defined in any lexically enclosingwith_tagssection, if uncaught. - We must generate code to store the labels for runtime access to support this. A simple assignment to a set is fine.
- We must then trampoline all of the helper functions, because this kind of jump may enter any of them. But that's no issue - the TCO mechanism already knows how to strip away unwanted trampolines (if tail-calling into a function that has its own trampoline).
- This solution still doesn't allow us to stash a closure containing a
go[]and call it later, after the original function has exited. Short of using general continuations, is that even possible? Does CL handle this case? If so, how?- CL indeed doesn't. See footnote 7 of the linked chapter of Seibel: "Likewise, trying to GO to a TAGBODY that no longer exists will cause an error." So it shouldn't matter if we don't, either.
- An unresolved
- The current draft doesn't work for that, since what we want is to shut down the inner trampoline, and give the
- Interaction with
with continuations? As always, this is the tricky part. - Position in xmas tree? At least
prefixandautoreturnshould go first. - Strings or bare names as input to
tag[...]andgo[...]? A string avoids upsetting IDEs with "undefined" names, but a bare name without quotes is shorter to type. - This should be orthogonal from
with lazify, since the helper functions take no arguments. Ifmyfunchas parameters, we can allowwith lazifyto process them as usual. with_tagssections should expand from inside out (so any inner ones are resolved first), so this is a second-pass macro.- Implies TCO. Leave an AST marker (see
ContinuationsMarkerfor an example), and modifywith tcoto leave alone anywith_tagssections it encounters.
Updated draft for macro output, supporting nested with_tags sections:
# actually no explicit import; just use `hq[]` in the macro implementation.
from unpythonic import trampolined, jump
# define in unpythonic.syntax.tagbody; unhygienic_expose to have it available at the macro use site
class JumpToTag(Exception):
def __init_(self, tag):
self.tag = tag
# uncaught error message
self.args = ("go[] to a label '{}' not defined in any lexically enclosing with_tags section".format(tag),)
# macro output
def myfunc(): # may take args and kwargs; we pass them by closure
# define our set of tags (capture these in the syntax transformer)
our_tags = {"foo"} # gensym'd variable name on LHS
# top-level code of myfunc, split into helper functions
# Each must be trampolined, because a go[] from an inner with_tags
# (if two or more are nested) may jump into any of them.
@trampolined
def body(): # gensym'd function name for part before first tag
nonlocal x
x = 42
return jump(foo)
@trampolined
def foo(): # use the tag name as the function name
nonlocal x
x += 1
if x < 10:
return jump(foo)
return "whatever"
# runner harness
# Copy locals to have a guaranteed-frozen copy of the current state,
# so we retain references to the helper functions even if the user code
# overwrites those names.
our_funcs = dict(locals()) # gensym'd variable name on LHS
f = body # gensym'd variable name on LHS
while True:
try:
return f()
# catch nonlocal jumps from inner nested with_tags sections
except JumpToTag as jmp:
if jmp.tag in our_tags:
f = our_funcs[jmp.tag]
else: # not ours, let the jump propagate further out
raise
# never reached; just for scoping top-level locals of myfunc
x = None
-
Note the
bodyhelper function should be omitted when the first item in the body is a tag.Or, from the opposite viewpoint, insert an implicit tag
body(with a gensym'd name) at the start when the first item is not a tag. -
We can refactor the runner harness into a function in
unpythonic.syntax.tagbodyand just call it in ahq[]. -
When the user code says
go["bar"]:- When
"bar"is ours, and thego[]appears at the top level ofmyfunc, transform intoreturn jump(bar). - In all other cases, transform into
raise JumpToTag("bar"), jumping by name (possibly unwinding several levels until the matching tag is found). - We know at compile time which tags are defined at each level so we always have the information to decide this.
- A simpler implementation could always use the exception mechanism. What the two-level implementation buys us is speed for the common case, local jumps within the same level.
- When
-
May need to adjust the uncaught error message for
JumpToTag. There are two distinct cases where an uncaughtJumpToTagmay occur:- Attempting to jump to a tag that is not defined in any lexically enclosing
with_tagsform. - Attempting to jump to any tag after the
with_tagsform has exited (if theraise JumpToTag("bar")was stashed in a closure, to be used later).- ~Actually, if it's a local jump,
return jump(bar)is even more dangerous, as it won't even emit an error. Should we always use exceptions to perform jumps, for simplicity and robustness?~ Not a problem, becausereturn jump(bar)can only work at the top level ofmyfuncanyway.
- ~Actually, if it's a local jump,
- Reaching the
exceptblock proves that at least onewith_tagsform that is still alive (whether or not related to the one that should actually be handling this particulargo[]) has seen theJumpToTaginstance. Could call a method to update the uncaught message.- Default:
"go[]: attempted jump to '{}' outside the dynamic extent of any with_tags form".format(tag) - Updated:
"go[]: attempted jump to '{}', tag not found in any dynamically enclosing with_tags form".format(tag) - The wording should make it clear that the lookup occurs by dynamic extent, not actually lexically, since this can cause bugs in case of a stashed
go[].
- Default:
- Attempting to jump to a tag that is not defined in any lexically enclosing
-
Regarding continuations: inspecting the code I wrote about a year ago (tailtools.py), this just might work without touching the implementation of the
continuationsmacro, as long as@with_tagsexpands first.Tail calls are edited by the continuation machinery to pass on the
ccargument, so the continuation should be unaffected by jumps to tags inside the original function (i.e., really, tail calls to closures). Theccparameter is added automatically to each of them, since they're just regular functions after the@with_tagsform has expanded away. So the value ofccis relayed all the way until the tag-using function exits (from any of the closures it contains), at which point the continuation machinery tail-calls the continuation.Obviously, this needs a test. It probably either works on the first try, or induces a lot of pain. :)
-
Maybe modify the TCO macro to analyze whether functions and tail calls it encounters are already TCO'd, so that we don't need to leave an AST marker for an expanded
@with_tags. (That's starting to look like the simpler option.)
Meh, maybe we should just go for the previous idea, using exceptions. That gives better semantics. Nesting feels pretty important and a goto should act like a goto (even if it can only jump within the same level or outward).
- Hmm. Since the whole point is to have a lexical construct, we don't need the exception machinery for treating nested
with_tagsblocks. We canreturn jumpjust like for the local level, since the names will be in scope!- But then, in case of nested
with_tagssections, a jump to an outer tag (from an inner block) will be processed by the inner trampoline, and the outer trampoline will remain alive. Is this a problem? - This depends on how we want to define the semantics of nesting
with_tagssections.- Processing the jump in the inner trampoline means that jumping to an outer label jumps there within the inner call. Once the call returns, execution resumes normally. Can't exit an outer block by a
go[]from an inner block. - If we kill the inner trampoline and let the outer trampoline handle the jump, then a
go[]acts as a goto - the inner call is cancelled, as is the rest of the caller, and execution resumes at the tag.
- Processing the jump in the inner trampoline means that jumping to an outer label jumps there within the inner call. Once the call returns, execution resumes normally. Can't exit an outer block by a
- To simplify, we could just disallow nesting
with_tagssections, too. But that's bad for copy'n'pasting snippets that may use the construct.
- But then, in case of nested
- This actually also allows invoking a
gostashed in a closure after the dynamic extent of the original function has ended, so it's also a form of continuations.return lambda: go["fish"]becomesreturn lambda: jump(fish), and the lambda retains the state of any local variables in the closure. So it effectively pauses execution.- On the other hand, that
jumpis useless after the trampoline has exited. - On the third hand, just give it a new trampoline.
f = trampolined(lambda: jump(fish)); f()should work.- We could package that as a utility function
resume(thunk), which takes a thunk, trampolines it and calls the result, returning its return value. This utility should probably be placed in thetcomodule. - Or even better, package
resumable(target, *a, **kw), which can be used instead ofjump(target, *a, **kw), when the intent is to delay the jump, and just return a thunk that jumps to the target when called. It could create and returntrampolined(lambda: jump(target, *a, **kw)). Then this object can be called normally to start the new trampoline and perform the jump. A jump object by itself is inert data anyway, so there's no need to have thelambda: ...at the call site.- Maybe return a
lambda *a2, **kw2: ...instead of justlambda: ..., to allow specifying additional args and adding/overriding kwargs.
- Maybe return a
- Do we need even that? We could just return a reference to the closure. It has a trampoline set up already.
- When manually used, that's perfect. This isn't a
go[], though, so forTAGBODY/GOwe need another name for this semantic.golater[],plantogo[],goable[],resumable[],prepare[],prime[],resume_at[],resumer[]?
- We could package that as a utility function
- So this is actually a simpler way to have multiply enterable continuations in Python?! The limitation is we can't easily feed values into the continuation, and it doesn't do the "parent continuation" thing, but if we want to take this in that direction, we could design a way to pass arguments in
go[]before we implement this.
- On the other hand, that
I had hoped to get this into 0.14.3, but the technical cleanup has already produced enough material for a minor update. Better to push that out first, and then concentrate on these additions.