executing icon indicating copy to clipboard operation
executing copied to clipboard

Sentinel

Open 15r10nk opened this issue 2 years ago • 12 comments

different fixes for the SentinelNodeFinder

currently work in progress.

The question if this are required when we stop supporting python < 3.8

#64 is a new node finder which could be used for the same versions.

15r10nk avatar Sep 10 '23 13:09 15r10nk

hi @alexmojaki, I'm currently working on the 3.13 branch but found some issues with the SentinelNode finder which cause failing tests. The test failures are random in the ci because we test a random subset of the cpython source code.

I fixed some of theme but I have now one problem which I don't know to solve.

The last wip commit of this branch contains a small_sample which reproduces this bug.

(dict(((k.lower(), v) for (k, v) in self.itermerged())) == (k.lower for (k, v) in something))

you can analyse it with:

python tests/analyse.py tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25fd0baffa6104bcc10
155fa1.py 0:20

Bildschirmfoto 2024-07-19 07:29:41

I hope you can help fixing it or understand why this bytecode can not be found, in which case we would need some code in test_main.py:TestFiles.check_filename which accepts this bug as known issue.

15r10nk avatar Jul 19 '24 05:07 15r10nk

This bug can be reproduced with python 3.8.3

15r10nk avatar Jul 19 '24 05:07 15r10nk

Can you give a traceback for the error?

alexmojaki avatar Jul 19 '24 10:07 alexmojaki

----------------------------------------- Captured stdout call -----------------------------------------
check /home/frank/projects/executing/tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25fd0baf
fa6104bcc10155fa1.py
----------------------------------------- Captured stderr call -----------------------------------------
node without associated Bytecode
in file: /home/frank/projects/executing/tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25fd0
baffa6104bcc10155fa1.py
correct: False
len(values): 0
values: []
deadcode: False

ast node:
lower
Attribute(value=Name(id='k', ctx=Load()), attr='lower', ctx=Load())
parents: [<_ast.GeneratorExp object at 0x7f7e5f952bb0>, <_ast.Compare object at 0x7f7e5f952220>, <_ast.E
xpr object at 0x7f7e5f952460>, <_ast.Module object at 0x7f7e6010f550>]
node range: lineno=2 end_lineno=2 col_offset=60 end_col_offset=67
source code:
   1: 
   2: (dict(((k.lower(), v) for (k, v) in self.itermerged())) == (k.lower for (k, v) in something))

======================================= short test summary info ========================================
FAILED tests/test_main.py::test_small_samples[471ff7c2daa37eded70f016214f3ebe3915659b25fd0baffa6104bcc10
155fa1.py] - AssertionError

The execption is raised inside the test, because it can not find bytecode for the node. It looks for me like the bytecode node mapping fails and this node is missing a bytecode now. But I have no idea what the problem with this code sample could be.

15r10nk avatar Jul 19 '24 18:07 15r10nk

I meant a traceback for the exception Expected one value, found 2. I think the problem is that both code objects look similar (name, lineno, variable names) so it fails there before even looking at instructions.

alexmojaki avatar Jul 19 '24 18:07 alexmojaki

here is the backtrace where the NotOneValueFound exception is raised:

>       TestFiles().check_filename(full_filename, check_names=True)

tests/test_main.py:749: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/test_main.py:882: in check_filename
    result = list(self.check_code(code, nodes, decorators, check_names=check_names))
tests/test_main.py:1461: in check_code
    for x in self.check_code(inst.argval, nodes, decorators, check_names=check_names):
tests/test_main.py:1237: in check_code
    ex = Source.executing(frame)
executing/executing.py:238: in executing
    node_finder = NodeFinder(frame, stmts, tree, lasti, source)
executing/executing.py:589: in __init__
    matching = list(self.matching_nodes(exprs))
executing/executing.py:658: in matching_nodes
    original_instructions = self.get_original_clean_instructions()
executing/executing.py:647: in get_original_clean_instructions
    for inst in self.compile_instructions()
executing/executing.py:742: in compile_instructions
    code = only(self.find_codes(module_code))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

it = [<code object <genexpr> at 0x7fe1e69bda80, file "/home/frank/projects/executing/tests/small_samples
/471ff7c2daa37eded7...k/projects/executing/tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25
fd0baffa6104bcc10155fa1.py", line 2>]

    def only(it):
        # type: (Iterable[T]) -> T
        if isinstance(it, Sized):
            if len(it) != 1:
>               raise NotOneValueFound('Expected one value, found %s' % len(it),it)
E               executing.executing.NotOneValueFound: Expected one value, found 2

executing/executing.py:71: NotOneValueFound
----------------------------------------- Captured stdout call -----------------------------------------
check /home/frank/projects/executing/tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25fd0baf
fa6104bcc10155fa1.py
mapping failed
Expected one value, found 2
value: <code object <genexpr> at 0x7fe1e69bda80, file "/home/frank/projects/executing/tests/small_sample
s/471ff7c2daa37eded70f016214f3ebe3915659b25fd0baffa6104bcc10155fa1.py", line 2>
value: <code object <genexpr> at 0x7fe1e69bd7c0, file "/home/frank/projects/executing/tests/small_sample
s/471ff7c2daa37eded70f016214f3ebe3915659b25fd0baffa6104bcc10155fa1.py", line 2>
search bytecode Instruction(opname='LOAD_FAST', opcode=124, arg=1, argval='k', argrepr='k', offset=10, s
tarts_line=None, is_jump_target=False)
in file /home/frank/projects/executing/tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25fd0b
affa6104bcc10155fa1.py

You can see the code objects in the screenshot above where I analysed them. But LOAD_ATTR(lower) and LOAD_METHOD(lower) looks different for me.

15r10nk avatar Jul 19 '24 20:07 15r10nk

After making AST modifications with the sentinel and compiling that, it needs to find which code object contains the changes. So it looks for a code object that looks similar to the code object of the original frame. The instructions will necessarily contain some differences so it can't use those to match. You can see the heuristics used in find_codes.

Now that I think about it, it could probably just look for a code object which has the sentinel string in co_consts. But maybe there's a reason that doesn't work. But even if that doesn't always work by itself, it would almost certainly work in combination with the existing strategy.

Otherwise, this is a known limitation that goes back all the way to the beginning. See https://github.com/alexmojaki/executing/blob/6a54f9890d7f438b82647b42da2f3834f864f34f/tests/test_main.py#L201. It's a very specific and rare edge case. In your example it happens because there's two <genexpr> codes on the same line, containing the same variable names (self and something aren't in those frames). Change k to k2 in one of them, or put a newline between them, and the problem should go away. This has probably never happened before in test_sys_modules, and if it's not easy to apply the suggested fix above then it's fine to just skip the file/module where it happens based on its name.

alexmojaki avatar Jul 19 '24 20:07 alexmojaki

Thank you for the explanation. I skipped the module tests for now in the 3.13 branch. I will look at it later again.

15r10nk avatar Jul 19 '24 21:07 15r10nk

Ah, but compile_instructions is called in two places. One of those places is where "look for a code object which has the sentinel string in co_consts" makes sense. The other place is in your traceback, before the AST is modified:

https://github.com/alexmojaki/executing/blob/6a54f9890d7f438b82647b42da2f3834f864f34f/executing/executing.py#L669-L684

But it only needs to do that if there are any JUMP_IF_NOT_DEBUG instructions in result, which is probably quite rare. If it checked that before calling compile_instructions then that would usually avoid the error at that line. It would also probably make everything significantly faster in general since compiling is expensive.

alexmojaki avatar Jul 19 '24 21:07 alexmojaki

I added the fix that you proposed, but I get now an exception at a different place:

tests/test_main.py:881: in check_filename
    result = list(self.check_code(code, nodes, decorators, check_names=check_names))
tests/test_main.py:1460: in check_code
    for x in self.check_code(inst.argval, nodes, decorators, check_names=check_names):
tests/test_main.py:1236: in check_code
    ex = Source.executing(frame)
executing/executing.py:238: in executing
    node_finder = NodeFinder(frame, stmts, tree, lasti, source)
executing/executing.py:589: in __init__
    matching = list(self.matching_nodes(exprs))
executing/executing.py:674: in matching_nodes
    instructions = self.compile_instructions()
executing/executing.py:740: in compile_instructions
    code = only(self.find_codes(module_code))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

it = [<code object <listcomp> at 0x7f473b5540e0, file "/home/frank/projects/executing/tests/small_sample
s/d8987afe0f74653bc...k/projects/executing/tests/small_samples/d8987afe0f74653bcddfe56e18c616dbd52d934cd
ff78644c5221935b35186f2.py", line 3>]

    def only(it):
        # type: (Iterable[T]) -> T
        if isinstance(it, Sized):
            if len(it) != 1:
>               raise NotOneValueFound('Expected one value, found %s' % len(it),it)
E               executing.executing.NotOneValueFound: Expected one value, found 2

executing/executing.py:71: NotOneValueFound
----------------------------------------- Captured stdout call -----------------------------------------
check /home/frank/projects/executing/tests/small_samples/d8987afe0f74653bcddfe56e18c616dbd52d934cdff7864
4c5221935b35186f2.py
mapping failed
Expected one value, found 2
value: <code object <listcomp> at 0x7f473b5540e0, file "/home/frank/projects/executing/tests/small_sampl
es/d8987afe0f74653bcddfe56e18c616dbd52d934cdff78644c5221935b35186f2.py", line 3>
value: <code object <listcomp> at 0x7f473b554500, file "/home/frank/projects/executing/tests/small_sampl
es/d8987afe0f74653bcddfe56e18c616dbd52d934cdff78644c5221935b35186f2.py", line 3>
search bytecode Instruction(opname='LOAD_GLOBAL', opcode=116, arg=0, argval='len', argrepr='len', offset
=8, starts_line=None, is_jump_target=False)
in file /home/frank/projects/executing/tests/small_samples/d8987afe0f74653bcddfe56e18c616dbd52d934cdff78
644c5221935b35186f2.py

15r10nk avatar Jul 23 '24 04:07 15r10nk

Now that I think about it, it could probably just look for a code object which has the sentinel string in co_consts. But maybe there's a reason that doesn't work. But even if that doesn't always work by itself, it would almost certainly work in combination with the existing strategy.

I will try this

15r10nk avatar Jul 23 '24 04:07 15r10nk

OK, I tried this for a while, it doesn't really work. I could maybe make it work in your case where the instructions are clearly different, but it doesn't seem worth it. Let's just skip this specific edge case.

alexmojaki avatar Aug 10 '24 15:08 alexmojaki