Creating functions through bytecode
I'm trying to create a new function / lambda purely through bytecode (no actual Python source code), but I can't find an example of how to do this.
The bytecode I want to generate:
import bytecode.tests
import dis
dis.dis(bytecode.tests.get_code("def some_fn(x): return x"))
Which prints out:
1 0 LOAD_CONST 0 (<code object some_fn at 0x7fbe893c6b30, file "<string>", line 1>)
2 LOAD_CONST 1 ('some_fn')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (some_fn)
8 LOAD_CONST 2 (None)
10 RETURN_VALUE
Disassembly of <code object some_fn at 0x7fbe893c6b30, file "<string>", line 1>:
1 0 LOAD_FAST 0 (x)
2 RETURN_VALUE
Here's how I'm trying to create the function:
from bytecode import ConcreteBytecode, ConcreteInstr
# Define the body of the function
bytecode_fn = ConcreteBytecode()
bytecode_fn.varnames = ["x"]
bytecode_fn.extend([ConcreteInstr("LOAD_FAST", 0), # var x
ConcreteInstr("RETURN_VALUE")])
# Convert bytecode_fn to code
fn_code_obj = bytecode_fn.to_code()
bytecode = ConcreteBytecode()
bytecode.names = ["some_fn"]
bytecode.consts = [fn_code_obj, "some_fn", None]
bytecode.extend([ConcreteInstr("LOAD_CONST", 2), # Default x: None
ConcreteInstr("LOAD_CONST", 0), # fn_code_obj
ConcreteInstr("LOAD_CONST", 1), # "some_fn"
ConcreteInstr("MAKE_FUNCTION", 1), # 1 arg
ConcreteInstr("STORE_NAME", 0), # "some_fn"
ConcreteInstr("LOAD_CONST", 2), # None
ConcreteInstr("RETURN_VALUE")])
# Execute bytecode
code = bytecode.to_code()
exec(code)
# Call created function
# Error:
#
# TypeError: <module>() takes from -16 to 0 positional arguments but 1 was given
some_fn(1)
# Error:
#
# UnboundLocalError: local variable 'x' referenced before assignment
some_fn()
This code creates the some_fn function, but with zero arguments. I believe MAKE_FUNCTION requires a default value on TOS (top of stack), which in my case is None.
How do I adjust this to create a function that accepts one argument and binds it to x? If we can figure this out, I'd be willing to do a write-up and add it to the documentation.
Figured it out. I had to set argcount on the function's bytecode object; it defaults to 0, which is why the function accepted no arguments.
from bytecode import ConcreteBytecode, ConcreteInstr
bytecode_fn = ConcreteBytecode()
bytecode_fn.varnames = ["x"]
bytecode_fn.consts = [10]
bytecode_fn.argcount = 1
bytecode_fn.extend([ConcreteInstr("LOAD_FAST", 0), # var x
ConcreteInstr("LOAD_CONST", 0),
ConcreteInstr("BINARY_ADD"), # Add 10
ConcreteInstr("RETURN_VALUE")])
fn_code_obj = bytecode_fn.to_code()
bytecode = ConcreteBytecode()
bytecode.names = ["some_fn"]
bytecode.consts = [fn_code_obj, "some_fn", None]
bytecode.extend([ConcreteInstr("LOAD_CONST", 0), # fn_code_obj
ConcreteInstr("LOAD_CONST", 1), # "some_fn"
ConcreteInstr("MAKE_FUNCTION", 0),
ConcreteInstr("STORE_NAME", 0), # "some_fn"
ConcreteInstr("LOAD_CONST", 2), # None
ConcreteInstr("RETURN_VALUE")])
# Execute bytecode
code = bytecode.to_code()
exec(code)
some_fn(1) # Returns 11
I guess I overlooked the fact that ConcreteBytecode inherits from BaseBytecode. Would you like this (or a similar "how to create a function" example) in the docs? If so, I can make a PR; otherwise this issue can just be closed.
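As a quick check of where the parameter count is stored, here is a small standard-library sketch. It does not use the bytecode package, and the source string is only for illustration. It suggests that the number of positional parameters comes from the code object (co_argcount, which argcount sets), while default values are kept separately on the function object.

```python
# Build a function from a source string purely so there is something to inspect.
namespace = {}
exec("def some_fn(x=None): return x", namespace)
fn = namespace["some_fn"]

# The parameter count and the local variable names live on the code object.
argcount = fn.__code__.co_argcount
varnames = fn.__code__.co_varnames

# Default values live on the function object, not on the code object.
defaults = fn.__defaults__

print(argcount, varnames, defaults)
```

If that reading is right, pushing a default before MAKE_FUNCTION cannot add a parameter on its own; the code object has to declare it first.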
This could be an interesting example to have in the docs. Feel free to make a PR.
Just a note: one can also create functions directly from a code object with types.FunctionType: https://github.com/P403n1x87/surgery/blob/792748fb7497d8e5d0998a54231fc16742250c55/surgery.py#L56
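A minimal sketch of that approach, assuming a code object is already available. Here it is taken from compile() to keep the example independent of the Python version, but a code object returned by to_code() should work the same way, provided its argcount is set. This is not the code from the linked file.

```python
import types

# Compile a module and pull the nested function's code object out of its constants.
module_code = compile("def some_fn(x): return x + 10", "<string>", "exec")
fn_code = next(c for c in module_code.co_consts if isinstance(c, types.CodeType))

# Wrap the code object in a function directly, without MAKE_FUNCTION or exec().
some_fn = types.FunctionType(fn_code, globals(), "some_fn")

result = some_fn(1)
print(result)
```

This skips the outer module-level bytecode entirely, which may be simpler when only the function itself is needed.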