Completions fail with KeyError in parser_cache if fast_parser is False
The autocompletion jedi.Script#complete fails with a KeyError because parso.cache.parser_cache is empty when jedi.settings.fast_parser = False.
Try this script to reproduce the issue:
import jedi

jedi.settings.fast_parser = False
script = jedi.Script(
    code="im",  # Just starting a new file. Trying to get 'import' completed
    path="",  # Using "" for unsaved files as described in the docstring
)
# Complete at the end of the code
completions = script.complete()
print(completions)
Traceback:
Traceback (most recent call last):
  File "/home/benjamin/misc/jedi/./jedi-18-1-bug.py", line 12, in <module>
    completions = script.complete()
  File "/home/benjamin/misc/jedi/jedi/api/helpers.py", line 487, in wrapper
    return func(self, line, column, *args, **kwargs)
  File "/home/benjamin/misc/jedi/jedi/api/__init__.py", line 214, in complete
    return completion.complete()
  File "/home/benjamin/misc/jedi/jedi/api/completion.py", line 170, in complete
    cached_name, completion_names = self._complete_python(leaf)
  File "/home/benjamin/misc/jedi/jedi/api/completion.py", line 314, in _complete_python
    completion_names += self._complete_global_scope()
  File "/home/benjamin/misc/jedi/jedi/api/completion.py", line 379, in _complete_global_scope
    for filter in filters:
  File "/home/benjamin/misc/jedi/jedi/inference/context.py", line 486, in get_global_filters
    yield from context.get_filters(
  File "/home/benjamin/misc/jedi/jedi/inference/context.py", line 318, in get_filters
    next(filters, None)
  File "/home/benjamin/misc/jedi/jedi/inference/value/module.py", line 63, in get_filters
    ParserTreeFilter(
  File "/home/benjamin/misc/jedi/jedi/inference/filters.py", line 138, in __init__
    super().__init__(parent_context, node_context)
  File "/home/benjamin/misc/jedi/jedi/inference/filters.py", line 100, in __init__
    self._parso_cache_node = get_parso_cache_node(
  File "/home/benjamin/misc/jedi/jedi/parser_utils.py", line 287, in get_parso_cache_node
    return parser_cache[grammar._hashed][path]
KeyError: '715ad56c5f4f8395092c58b6b6f2deb4f906f81380929a836bd86ab253634875'
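The failing line is a plain two-level dict lookup. A minimal simulation (a sketch with stand-in names, not parso's actual code) shows why an unpopulated cache raises here:

```python
# Simulation of parso's parser_cache: a two-level mapping from grammar
# hash to file path to cached module node. With fast_parser disabled the
# module never gets saved here, so the first lookup fails on the hash.
parser_cache = {}

hashed_grammar = "715ad56c5f4f8395092c58b6b6f2deb4f906f81380929a836bd86ab253634875"
try:
    parser_cache[hashed_grammar][""]
except KeyError as exc:
    print("KeyError:", exc)  # same shape of error as in the traceback
```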
Using git bisect, I found that the first commit with this bug is https://github.com/davidhalter/jedi/commit/b9067ccdbbd308e96c5384765d1743f2da062d0e. It might also be related to https://github.com/davidhalter/jedi/commit/db0e90763be0f65de1a03a270f10272b46184892 and the comment in that commit. The issue didn't happen when I used None for the path parameter of jedi.Script. However, I am not sure if this is the solution or just a workaround.
Using None there will trigger other issues. This looks like a bug.
Hi, is there any progress on this?

I'm using parso to compare and analyze different versions of code, so I need to either turn off fast_parser or deepcopy the entire parso tree. Due to this bug, I now have to use deepcopy, but it's about 3 times slower than just turning off fast_parser. Is there any simple fix to this cache error (or a faster alternative to using deepcopy)? Thanks!
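On the deepcopy side: parso pickles trees for its own disk cache, so a pickle round-trip is a plausible alternative to copy.deepcopy; whether it is actually faster on your trees is an assumption to benchmark, not a guarantee. A toy sketch with a nested dict standing in for a parso tree:

```python
import copy
import pickle

# Toy nested structure standing in for a parso tree.
tree = {"type": "module",
        "children": [{"value": i, "children": []} for i in range(1000)]}

copied = copy.deepcopy(tree)
# A pickle round-trip often beats deepcopy on large object graphs, but
# measure both on real parso trees before committing to it.
cloned = pickle.loads(pickle.dumps(tree))

assert copied == tree and cloned == tree and cloned is not tree
```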
Hi there! We hit this issue on an internal code analysis project that uses Jedi. We worked around it by patching jedi/inference/filters.py to cleanly handle the KeyError, but this feels like a workaround that doesn't really solve the underlying issue.
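For context, the shape of such a defensive patch can be sketched with a plain dict standing in for parso's cache (a hypothetical helper echoing jedi's parser_utils.get_parso_cache_node, not the upstream code):

```python
# Stand-in for parso.cache.parser_cache: grammar hash -> path -> node.
parser_cache = {}

def get_parso_cache_node(hashed_grammar, path):
    # Treat a cache miss as "no cached node" instead of letting the
    # KeyError propagate into jedi's filters.
    try:
        return parser_cache[hashed_grammar][path]
    except KeyError:
        return None
```

Returning None on a miss is attractive because the callers already have to cope with files that were never cached; the open question in this thread is why the entry is missing in the first place.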
@davidhalter If you want, I can submit a PR with our patch, but I'm also willing to implement a more correct fix if you have a suggestion, so that it can be upstreamed.
@mithaler I'm interested to find out what you're using Jedi for internally. You can also email me if you don't want to say it in public.
I understand that your patch fixes this issue, but at this point nobody understands why this happens (I don't either). So we somehow would need to understand why this happens, because if we don't we risk triggering a ton of other bugs.
> I'm interested to find out what you're using Jedi for internally.
Sure, I can sort of answer this: we want to detangle a large monorepo, so we're generating dependency trees at a function level. This requires us to recursively look up definitions of functions, so we're using a language server, in this case Jedi's.
> I understand that your patch fixes this issue, but at this point nobody understands why this happens (I don't either). So we somehow would need to understand why this happens, because if we don't we risk triggering a ton of other bugs.
While I can't provide a test case here (it would involve revealing the internal source code we're analyzing), I can say that this only happens when we delve deeply into the dependency tree, and it happens unpredictably (that is, I see errors on different parts of the tree on different runs on the same code). I'm not entirely sure how Parso's cache works, but I suspect weak references being GC'ed are a possible culprit?
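The weak-reference suspicion is easy to illustrate in isolation (this does not establish that parso's cache actually uses weak references; it only shows the failure mode such a cache would produce): entries in a weakref.WeakValueDictionary disappear once the last strong reference to the value is gone, which yields exactly this kind of run-to-run nondeterminism:

```python
import gc
import weakref

class Node:
    """Stand-in for a cached parse-tree node."""

cache = weakref.WeakValueDictionary()
node = Node()
cache["module"] = node
assert "module" in cache

del node      # drop the only strong reference
gc.collect()  # after collection the cache entry silently disappears
assert "module" not in cache
```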
@mithaler If you're using a language server, are you sure fast_parser is False? That's not how almost everyone uses Jedi. The original issue is not very surprising to me, because almost nobody uses that code path AFAIK, so I'm not surprised there are bugs around caching (because it's not really well tested and likely an easily fixable bug once someone reads a bit of code).
I'm not sure of that -- I just know that the stack trace we see is very similar (though not identical), and my commit, linked above, appears to fix it.
@davidhalter Regarding what @mithaler mentions here, i.e. "I can say that this only happens when we delve deeply into the dependency tree, and it happens unpredictably (that is, I see errors on different parts of the tree on different runs on the same code)", I also notice exactly the same thing.
@davidhalter Can you explain the purpose/implications of setting fast_parser to True or False? At the moment, I care less about speed and more about correctness, so in that case, would it be better to set fast_parser to False?
Setting fast_parser to False was mostly used by people that had issues with the incremental parsing of parso. This used to be a bit buggier, but now that I fuzzed it a lot it's very stable and has no known bugs. So disabling it is not interesting regarding correctness. It's just interesting if you wanted to get rid of exceptions, which is not really the case anymore.
But it's actually a good question, because the original poster might just use the fast_parser default. It's of course still a bug.