Add `tree_rev` attribute to `TreeRebuilder`
Description
First step for https://github.com/PyCQA/pylint/issues/5466#issuecomment-1033211018
Although it should work like I describe below, I'm certainly open for ideas how to make it better. Especially with regards to the "global" state for tree_rev. Let me know what you think @Pierre-Sassoulas @DanielNoord.
Important changes
- A new
tree_revattribute is added toTreeRebuilder. Usually a value should be passed with__init__, but it defaults toTREE_REVdefined inconst.py. - Each call to
TreeRebuilder.visitrequires one more attribute ->tree_rev. This might seem redundant, since the value is already stored as instance variableself.tree_rev. Unfortunately, it isn't possible to defined overloads based on instance variables though. The goal is to be able say:"For tree_rev = 1, visit of AST node x returns instance of A, for tree_rev = 2, it returns instance of B."Fortunately for us, to archive that it is not necessary to add the argument to eachvisit_xxxfunction. - A new
tree_revclass variable is added toAstroidManager. It's basically used to store a "global" state for the selected revision. See below on how it should be use. - A
tree_revattribute has been added toparse. Passing it would temporarily overrideAstroidManager.parse. Note: This is only intended for experimenting. In production the whole tree should be generated with only one revision.
Rational
We are at a point were multiple issues require updates to the TreeRebuilder. At the moment, it isn't really possible to do them in a backwards compatible manner. With this change, libraries like pylint which use astroid can choose how they would want the Python AST to be parsed by astroid. Thus we will enable a more granular upgrade path. In the end that allows us to iterate on tree changes before bundling them all together into a new major version.
Intended use
Libraries may either choose to stick with the default revision provided by astroid or override the class variable AstroidManager.tree_rev. Changing the revision should be done before any astroid AST is build. For pylint the required change would be right after the imports in pylinter.py.
AstroidManager.tree_rev = 2
Impact for plugins
Each update to the tree should essentially be consider a breaking change for all plugins. pylint probably more so than astroid once. That's just the nature of it, unfortunately. The stability policy this PR uses only extends to libraries which generate the AST themselves (eg. pylint).
After a new revision is release, each plugin should be updated to make sure it's compatible. It's important to remember that plugins don't choose the revision themselves. Meaning if they support a large enough version range with multiple revisions, they need to make sure all work as expected. That can, for example, be done with revision guards by accessing AstroidManager.tree_rev. (Only important to remember not to do that on import as the version might not have been set yet. (?) )
That section also applies to astroids builtin brain modules and as a whole. We need to make sure astroid works with all currently supported revisions, regardless of the tree output.
Open Challenges
- A bit surprisingly, although the brain modules are imported before the
AstroidManagerand thus before the global revision is set, almost all actions are only done later during inference. All, except for one which needs to be fixed separately. https://github.com/PyCQA/astroid/blob/1cf4bf7f3f4491ddda76662a6c57206d10d990b0/astroid/brain/brain_builtin_inference.py#L142-L147 Instead of on import, the method should be run as part of_post_buildbefore any transforms are called. https://github.com/PyCQA/astroid/blob/1cf4bf7f3f4491ddda76662a6c57206d10d990b0/astroid/builder.py#L173-L175 The are existing testes for that, so it should be fairly easy to make sure it still works as intended. - As
astroidwill end up supporting multiple revision at the same time, we also need to make sure to test the whole test suite against each one. That will end up multiplying the time spend on these. In cases where different revisions change unrelated parts, it might be possible to do one tests for the before and after instead of each individually. The good new, that doesn't apply to pylint as it should only ever support one at a time.
Open Questions
-
How do we want to handle revision changes in pylint?
Increasing the major version feels wrong, since this isn't directly a user facing change. However as written currently, some things might as well break. So just a minor version bump might be misleading as well.
Just an idea, we could require plugins to specific their supported revisions (lower / upper bound). If the one used by pylint is outside the supported range, the plugin is disabled with a warning. If no bounds are set we would assume the current revision 1. With that in place, updates wouldn't be breaking once any longer and thus it could be possible to include them in
minorreleases as long as we make sure to highlight them in the release notes. The downside here is that we'll probably end up introducing fragmentation into the plugin ecosystem. - ???
Todos
- [ ] Changelog entry
- [ ] Documentation
- [ ] Rewrite reference change to the
TreeRebuilderusing that change. Test impact on pylint.
Type of Changes
| Type | |
|---|---|
| ✓ | :sparkles: New feature |
You have probably thought about this, but:
Don't we unnecessarily complicate things with this change. As you said before, astroid's main goals is to be an ast parser with extras for pylint. That was your own reason to reject some recent PRs I believe.
How many actively maintained and used projects actually using astroid and are unable to upgrade to a possible astroid 3.x. And if these projects exist: how likely are they to be used with projects that do upgrade to 3.x? Because as far as I can see, that's where the real problems start: a user downloading two packages one requiring 2.x and the other 3.x.
Do we know if there are actually projects that would be unable to upgrade to 3.x? And if so, can we perhaps help them with their upgrades?
I'm asking this as 1) I don't have a good solution for the pylint versioning problem yet which can become a deal-breaker here, 2) we would be committing a lot of time to backwards compatibility for non pylint-packages and 3) I think pylint is likely to want a 3.x version after the migration to argparse is complete as well. We could bundle that with astroid 3.x.
pylint 3.x would not feature many user-facing changes, but I think there are some breaking changes brewing for plugins, that would fit nicely with the theme of a potential astroid 3.x and we could write up a transitioning guide to help people move.
Don't we unnecessarily complicate things with this change.
Yes and not. The idea was to do astroid 3.x after all major changes are done. Just so we keep the maintenance low. The current challenge is that we need more time to prepare all of them. We are nowhere ready for 3.x yet.
After your comment I again thought about if there are options to do these changes backwards compatible. This would certainly be better. We'll have to see what's possible, but I'm already planing to do some initial PRs with this change before moving forward.
As for pylint 3.x, you're right that bundling it with the breaking changes in astroid would make sense. But even then, we need some sensible way to upgrade the pylint code itself. If we bundle multiple breaking changes into one release, we would need to fix all at once to be able to upgrade. With tree_rev, we could do these in steps.
Yes and not. The idea was to do
astroid 3.xafter all major changes are done. Just so we keep the maintenance low. The current challenge is that we need more time to prepare all of them. We are nowhere ready for3.xyet.
👍 Agreed.
After your comment I again thought about if there are options to do these changes backwards compatible. This would certainly be better. We'll have to see what's possible, but I'm already planing to do some initial PRs with this change before moving forward.
As for
pylint 3.x, you're right that bundling it with the breaking changes in astroid would make sense. But even then, we need some sensible way to upgrade thepylintcode itself. If we bundle multiple breaking changes into one release, we would need to fix all at once to be able to upgrade. Withtree_rev, we could do these in steps.
What about adding an option/global variable to pylint similar to tree_rev? We could support both versions in the codebase like we would do in astroid?
What about adding an option/global variable to
pylintsimilar totree_rev? We could support both versions in the codebase like we would do inastroid?
Not sure how practical that would be. I've created the first PR for the Try node change #1389. If you want, give it a go. So far I've only run the pylint functional tests and got 40+ errors. Haven't looked at what code would need to be updated though.
Next up will be the name node change for ClassDef and FunctionDef. I was thinking about, instead of replacing the name attribute, adding a new one name_node. It isn't perfect, but at least it would be backwards compatible and we could start using it today.
I reverted passing self.tree_rev to each self.visit call. After looking at #1389, it became clear that the benefit of slightly improved typing wouldn't be worth the additional effort required.
I'll leave this PR as draft for now. Will probably come back it it once I have much more time (in 2-3 months).
Pull Request Test Coverage Report for Build 2319748031
- 14 of 14 (100.0%) changed or added relevant lines in 4 files are covered.
- No unchanged relevant lines lost coverage.
- Overall coverage increased (+0.006%) to 91.723%
| Totals | |
|---|---|
| Change from base Build 2319296328: | 0.006% |
| Covered Lines: | 9187 |
| Relevant Lines: | 10016 |
💛 - Coveralls
Sorry I'm a little late to this discussion. This seems like something really impactful in term of maintenance required. As far as I know astroid is not used in any major lib except pylint. For a small script or a pre-commit hooks the astroid version can be set to an old one easily, or upgraded easily because there isn't that much code. So we're not making anyone life miserable by making breaking change in astroid without handling retro-compatibility, unless I'm mistaken. If we would actually make someone life miserable, and if adding rev_tree in astroid is easier than upgrading their library, then the persons affected could do this rev_tree change themself directly in astroid with their own requirements in mind so it would probably fit their need better than what we imagine they would need.
As for us we control the astroid version in pylint so we can use astroid > 3 and upgrade pylint's code when we do upgrade ? Basically my argument is that YAGNI.
I broke the __pkginfo__ in pylint when I upgraded the packaging to use setuptools / setup.cfg. I could have anticipated everything and kept all the existing API at great cost for maintaining it (with everything duplicated between setup.cfg and pylint/__pkginfo__.py) but I did not and once it was released it turns our only numversion was important or used, so I added it back. and some package maintainers set their version of pylint explicitely in their tests. final results it that pylint 2.8.1, is not a package that is working well with pylama or prospector but it's not that much of a problem.
@Pierre-Sassoulas You're right the proposal would complicate maintenance and doesn't even solve all issues. I'm a bit glad we didn't do it 😅
The big issue with the proposal is that even for pylint it wouldn't be all that useful. Yes we could have revisions for the ast tree, but every update we do needs to be complete form the start since we can only select distinct versions. Incremental updates would be better.
If I think back, the change from doc to doc_node worked quite well. Emulating that for future updates might be a far better way forward. The only "issue" is that we can't directly replace nodes so we need to work around that a bit.
This is what I was thinking of now for the Try node #1389. I plan to update the PR soon.
- Add a new attribute to the existing
TryExceptandTryFinallynodes which exposes the newTrynode, e.g.try_node. With that pylint can still access the "old" nodes like it's doing now. However, we can also start to update the code one part at a time. - Add deprecation warnings to all attributes and methods from the old
TryExceptandTryFinallynodes (except the newtry_nodeattribute). We can add a decorator and use__getattribute__for that so it shouldn't be too complicated. - For astroid
3.0we removeTryExceptandTryFinallyand insert theTrynode directly in the tree. For backwards compatibility, we add atry_nodeattribute (toTry) which just returnsself. - In pylint, we now need to remove the
try_nodeattribute which should be strait forward since the code itself already uses the new node.node.try_node.xxx->node.xxx - Lastly, we deprecate the
try_nodeattribute and remove it in astroid4.0.
There might be some edge cases especially around get_children and nodes_of_class but those should be manageable.
What do you think?
The deprecation cycle sounds about right, we already have a lot of deprecation like this going on for pylint 3 and astroid 3. We're going to have to release 3.0 at some point, and 4.0 seems very far at this point but if there's no easy way to make it in one step we don't have much of a choice.
Superseded by #1867