astroid
astroid copied to clipboard
Improve consistency of JoinedStr inference
This PR:
- improves consistency of JoinedStr inference by always returning either Uninferable or a fully inferred Const
- optionally, using a flag on JoinedStr, allows returning a partially inferred string such as "a/{MISSING_VALUE}/b" when inferring f"a/{missing}/b" and missing cannot be inferred.
Type of Changes
| ✓ | 🐛 Bug fix | ✓ | :sparkles: New feature |
Description
See above
Closes #2621
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 93.21%. Comparing base (
e380fd1) to head (4e9680c). Report is 48 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #2622 +/- ##
==========================================
- Coverage 93.24% 93.21% -0.04%
==========================================
Files 93 93
Lines 11060 11076 +16
==========================================
+ Hits 10313 10324 +11
- Misses 747 752 +5
| Flag | Coverage Δ | |
|---|---|---|
| linux | 93.10% <100.00%> (-0.04%) |
:arrow_down: |
| pypy | 93.21% <100.00%> (-0.04%) |
:arrow_down: |
| windows | 93.21% <100.00%> (-0.04%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
| Files with missing lines | Coverage Δ | |
|---|---|---|
| astroid/nodes/node_classes.py | 94.93% <100.00%> (-0.21%) |
:arrow_down: |
I don't really understand your use case: can't we just return
Uninferablewhen we can't infer the full string? That seems consistent with other nodes and is also the suggestion I already made when inference ofJoinedStrwas first added.
Returning Uninferable is not helpful in some usage scenarios.
Consider the following example:
def method_with_arg(arg):
spark.table(f"/mnt/root/dir/{arg}/logs")
We want to tell our users that reading spark tables from mounts is no longer supported.
With JoinedStr.inferred() returning Uninferable, we're unable to detect that.
The feature proposed in this PR makes it possible, optionally, to return "/mnt/root/dir/{MISSING_VALUE}/logs", which is what we need to detect the "/mnt/" prefix, and also provide better feedback to users.
Happy to place the flag elsewhere.
The feature proposed in this PR makes it possible, optionally, to return "/mnt/root/dir/{MISSING_VALUE}/logs", which is what we need to detect the "/mnt/" prefix, and also provide better feedback to users.
I guess l'm wondering why you would detect that by calling inferred(), wouldn't you just directly inspect the AST?
>>> from astroid import extract_node
>>> n = extract_node("""
... def method_with_arg(arg):
... spark.table(f"/mnt/root/dir/{arg}/logs")
... """)
>>> from astroid import nodes
>>> js = list(n.nodes_of_class(nodes.JoinedStr))[0]
>>> js.values[0].value
'/mnt/root/dir/'
I'll comment further on the issue.
I guess l'm wondering why you would detect that by calling inferred(), wouldn't you just directly inspect the AST?
Oh, I see, it's so that both cases (mnt injected by the formatted value) can be handled by one call. I guess I'm not sure astroid needs to support that.
Personally I feel like this use case is best served with a dedicated method, or an implementation in the pylint plugin itself. There just seem to be too many edge cases where the behaviour of the flag would be hard to define.
I think .infer() should just have one behaviour which should be both easily explainable and matching with what we do for other node types. Any specific JoinedStr behaviour should probably not be part of a generic interface.
The feature proposed in this PR makes it possible, optionally, to return "/mnt/root/dir/{MISSING_VALUE}/logs", which is what we need to detect the "/mnt/" prefix, and also provide better feedback to users.
I guess l'm wondering why you would detect that by calling
inferred(), wouldn't you just directly inspect the AST?>>> from astroid import extract_node >>> n = extract_node(""" ... def method_with_arg(arg): ... spark.table(f"/mnt/root/dir/{arg}/logs") ... """) >>> from astroid import nodes >>> js = list(n.nodes_of_class(nodes.JoinedStr))[0] >>> js.values[0].value '/mnt/root/dir/'I'll comment further on the issue.
Your proposal is not workable with the following code:
def method_with_arg(arg):
path = f"/mnt/root/dir/{arg}/logs"
spark.table(path)
not speaking of even more complex scenarios:
def method_with_arg(arg):
paths = [f"/mnt/root/dir/{arg}/logs", "/some/other/path"]
tables = [spark.table(path) for path in paths]
That's what makes node.inferred() so valuable!
Personally I feel like this use case is best served with a dedicated method, or an implementation in the
pylintplugin itself. There just seem to be too many edge cases where the behaviour of the flag would be hard to define.I think
.infer()should just have one behaviour which should be both easily explainable and matching with what we do for other node types. Any specificJoinedStrbehaviour should probably not be part of a generic interface.
We're using astroid directly so a pylint plugin is not an option.
I agree with the above philosophy in theory, but I disagree in practice because it stops me from getting the outcome I need from astroid. We're using it for inference (we could do parsing and transforms using ast)
I guess it's legitimate for users to want to customise behaviour a bit, depending on their requirements.
A better design would be to provide a replaceable NodeFactory, such that users can provide their own implementation of JoinedStr. Happy to provide such PR if it's welcome.
I don't think I want the flag to be a thing on the class.
astroidis not really configurable and I think that is actually beneficial in terms of maintainability.I don't really understand your use case: can't we just return
Uninferablewhen we can't infer the full string? That seems consistent with other nodes and is also the suggestion I already made when inference ofJoinedStrwas first added.
TBH this PR is not proposing to change the current behaviour (see FAIL_ON_UNINFERABLE = True), rather it makes it possible to customise it if required.
@jacobtylerwalls @DanielNoord @Pierre-Sassoulas can you opine on the factory proposal ?
I think .infer() should just have one behaviour which should be both easily explainable and matching with what we do for other node types. Any specific JoinedStr behaviour should probably not be part of a generic interface.
Agree. But if we know it's a string we can return a string even if it's not 'perfect'.
Still lightly in favor of inferring string the best we can even if we have to include 'Uninferrable' in the middle
(/mnt/{Uninferable}/...). But it should be safeguarded, i.e. :
f"/mnt{uninferrable}/..."=> we infer a str :/mnt/{Uninferable}/...- "/mnt" + uninferrable + "/..." => there's no cast to string for uninferrable so we might have a crash and we need to still detect it
I guess I'm not certain we even know it's a string. Don't these uninferable cases include failures to format?
Exactly, or they can include memory overflow issues, or calls to functions that crash or exit the interpreter.
I don't really see why we would special case this infer. If we infer an import we also know that it will be a module but we don't insert dummy module nodes if we can't infer the import.
Let's remove the flag and return fully inferred const or uninferable then.
That doesn't solve my problem. Any comments on the node factory proposal ?
I think without seeing the code to do this we can't really comment on it.
Bigger refactors of astroid have sadly failed before so I'm wary of rewriting how we construct nodes.
Can't you solve your use cases with a function that takes these nodes? Potentially by making that a method in astroid.
Although you're not writing a pylint plug-in your use case does seem to imply you know what kind of node you're dealing with before calling infer.
I think without seeing the code to do this we can't really comment on it. Bigger refactors of
astroidhave sadly failed before so I'm wary of rewriting how we construct nodes.Can't you solve your use cases with a function that takes these nodes? Potentially by making that a method in
astroid. Although you're not writing apylintplug-in your use case does seem to imply you know what kind of node you're dealing with before callinginfer.
This was just an example. We have dozens of use cases where a partial value is preferable to Uninferable, and the node type could be anything (Name, Attribute).
Re node factory, I'm not looking for pre-approval of a PR, rather I'm not interested in investing time providing one if philosophical concerns make it a no-go. I guess the above means 'possibly ok' ?
Could you provide more? I'm still leaning towards infer() should be correct or Uninferable but I'm definitely open to suggestions. I just find it hard to wrap my head around how we could introduce this behaviour change without breaking expectations everywhere.
I'm wondering if try_to_infer() would be a nice extension of the Protocol.
"Possibly ok" is correct in the sense that we never reject proposals by default. I must warn though that maintainers time on astroid is very limited so grand refactors of the codebase might be hard to pull off.
A node factory would transfer the responsibility of partial inference behavior to astroid users (via a dedicated class deriving from JoinedStr) rather than have it with the astroid team. From there the builtin JoinedStr.infer() would indeed be either correct or Uninferable, but astroid users would be able to override the behavior at no cost for astroid and astroid team.
I suspect the refactoring would be small in essence (impacting mostly the parser) but may span minor changes across many files.
I'm still not convinced the effort is worth the result, especially considering the alternatives seems much nicer to me:
- Make this part of the package you are working on to get full control
- Add a 'partial infer' method that has the behaviour you're looking for
I'm still not convinced the effort is worth the result, especially considering the alternatives seems much nicer to me:
- Make this part of the package you are working on to get full control
- Add a 'partial infer' method that has the behavior you're looking for
Not sure what you mean by 1. ? Can you explain ? The last thing we want to do is fork astroid...
Re partial_infer, such a method would have to be available on all node types since the JoinedStr may not be immediately reachable, see for example:
def print_stuff(a):
b = f"Hello {a}!"
print(b)
In the above, b is a Name, not a JoinedStr, so we'd need a partial_infer method on Name, that would call the partial_infer method of its value... Not sure it's nicer...
A node factory has the immense benefit of allowing all sorts of overrides, not just this one...
Looks like this PR is not going o make it, despit having addressed the comments, and narrowed the scope to bug fixing ?
Please wait a little before falling into despair, we're volunteers with limited time, no fast reviews do not mean it's going to be rejected only that we did not find the time to review yet ;)
Re the factory proposal, as far as I understand, we have the ability to do that already:
>>> import astroid
>>> def my_transform(node):
... def _infer_with_values(self, context=None, **kwargs):
... 1/0 # replace with your override
... node._infer_with_values = _infer_with_values
... return node
...
>>> astroid.MANAGER.register_transform(astroid.nodes.JoinedStr, my_transform)
>>> js = astroid.extract_node("f'{missing}'")
>>> js.inferred()
Traceback (most recent call last):
File "<python-input-4>", line 1, in <module>
js.inferred()
~~~~~~~~~~~^^
File "/Users/jwalls/astroid/astroid/nodes/node_ng.py", line 585, in inferred
return list(self.infer())
File "/Users/jwalls/astroid/astroid/nodes/node_ng.py", line 167, in infer
for i, result in enumerate(self._infer(context=context, **kwargs)):
~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jwalls/astroid/astroid/nodes/node_classes.py", line 4785, in _infer
yield from self._infer_with_values(context)
~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
File "<python-input-1>", line 3, in _infer_with_values
1/0
~^~
ZeroDivisionError: division by zero
Agree with Jacob here, thanks for taking in the review and changing the PR. I don't have time for a review: @jacobtylerwalls would you be okay with merging this?
@ericvergnaud We would first need to rebase to prevent the conflicts.
Agree with Jacob here, thanks for taking in the review and changing the PR. I don't have time for a review: @jacobtylerwalls would you be okay with merging this?
@ericvergnaud We would first need to rebase to prevent the conflicts.
The PR says there are no conflicts, so not sure why rebasing would be needed ?
Thank you for you work and persistence on this @ericvergnaud !