astroid icon indicating copy to clipboard operation
astroid copied to clipboard

Add brain to fix tensorflow import errors.

Open hmc-cs-mdrissi opened this issue 2 years ago • 19 comments

Summary

Fix the import false positives with tensorflow. I was guided by this comment, although the implementation is rather different as that comment was for much older tensorflow version.

Type of Changes

Type
:bug: Bug fix

Description

Closes pylint issue.

I've manually tested fix on tensorflow 2.5, 2.8, and 2.12 latest for tensorflow.estimator which gives import-error without this fix, but imports fine with it. 2.5 was released about 2 years ago (May 2021) and is newest version to support python 3.9. If you go further back eventually this solution may not work as packaging/import magic has evolved for tensorflow. However even tf 2.5 is already too old to be compatible with pylint. tf 2.5 has upper bound on typing extensions <4 making it depenency conflict to try to install it + pylint. So I think working that far back is long enough. Also as this only adds a fallback for import-errors even if user installs a very old tf version like 1.10 and ignores dependency conflicts this will not worsen their experience.

Most tensorflow imports go through tensorflow.python so this adds fallback to try that. You can see that happening here.

Testing

How should I test this? Should I? It's fairly short simple brain. If it's fine to add tensorflow as a dependency to requirements_full similar to numpy I can add it there and try to add a test case similar to the ones found in test/brain.

hmc-cs-mdrissi avatar May 09 '23 07:05 hmc-cs-mdrissi

I have not had a look at this PR, but was looking into the brains recently for performance implications.

What about putting this into a pylint-tensorflow plugin? The issue with these brains is that they are loaded for all users of pylint even if they are not using tensorflow. That will obviously be a considerable group. This is very much out of the scope of your PR @hmc-cs-mdrissi but I think we should consider this before adding more brains for stuff outside of the stdlib.

DanielNoord avatar May 09 '23 07:05 DanielNoord

Codecov Report

Merging #2174 (7a7a582) into main (351c8de) will decrease coverage by 0.02%. The diff coverage is 80.00%.

:exclamation: Current head 7a7a582 differs from pull request most recent head 0411505. Consider uploading reports for the commit 0411505 to get more accurate results

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2174      +/-   ##
==========================================
- Coverage   92.60%   92.59%   -0.02%     
==========================================
  Files          94       95       +1     
  Lines       10800    10810      +10     
==========================================
+ Hits        10001    10009       +8     
- Misses        799      801       +2     
Flag Coverage Δ
linux 92.34% <80.00%> (-0.02%) :arrow_down:
pypy 87.53% <80.00%> (-0.01%) :arrow_down:
windows 92.18% <80.00%> (-0.02%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
astroid/brain/brain_tensorflow.py 80.00% <80.00%> (ø)

codecov[bot] avatar May 09 '23 07:05 codecov[bot]

Hmm I'm fine with that. There is no pylint-tensorflow package today though so it would need to be created/uploaded to pypi first. It's easier on my side to maintain this one file vs add a repository/package to maintain pylint-tensorflow (probably skipping setting up CI/related things). I'd also guess visibility of pylint-tensorflow will be low relative to tensorflow and most users won't install it. If you decide (entirely fair) to want this not in astroid proper I'm fine closing this pr.

I think this particular brain is about as cheap performance wise as can be though. It adds one import-error fallback that checks for exact module name of tensorflow and exits if any other name.

edit: One other idea is could astroid have extras and there be astroid[tensorflow] with small astroid-tensorflow package here?

hmc-cs-mdrissi avatar May 09 '23 07:05 hmc-cs-mdrissi

The issue with these brains is that they are loaded for all users of pylint even if they are not using tensorflow.

It's easier on my side to maintain this one file vs add a repository/package to maintain pylint-tensorflow (probably skipping setting up CI/related things).

Great points. I did not check specifically how much time was taken in startup to load brain but everything count and adds up on each invocation of pylint. We probably need to create a pylint-dev/pylint-x project for each of our brains if we want plugin to be a reasonable solution for contributors and get better performance overall. This could start with a cookie cutter template for brain-plugins, then a migration of our brains to their own project. We also need to warn users that they will need to install those plugins in the future. Seems like the path forward long term tbh, but it's not something that is fast and doable overnight.

So meanwhile we should merge brains like we always did so astroid is still useful for tensorflow users, right ?

Pierre-Sassoulas avatar May 09 '23 10:05 Pierre-Sassoulas

I think my main open question left is recommendations for testing this. Is it fine to add tensorflow here for full test dependencies and then add a few test cases checking it works for imports/still returns an import error for a fake module?

hmc-cs-mdrissi avatar May 09 '23 17:05 hmc-cs-mdrissi

For any other lib I would say to add it like we did for numpy, urllib, etc. But tensorflow is like 400 mb to download, right ?

Pierre-Sassoulas avatar May 09 '23 18:05 Pierre-Sassoulas

Yes. It is a large wheel and not an ideal dependency for 1 test case.

I'm fine maintaining it without test here as I will use this for my own internal codebase (~30k lines) where pylint is enforced in CI and will run on a lot of tensorflow imports.

hmc-cs-mdrissi avatar May 09 '23 18:05 hmc-cs-mdrissi

I'm considering creating a proof of concept project in pylint-dev for the tensorflow plugin (we need autoregistry it when the package is installed ? I'm not sure of the specific, but there will be some rough edges to smooth). A lot of person could benefit from your contribution. In short if we don't do it for tensorflow we're never going to start.

Pierre-Sassoulas avatar May 09 '23 18:05 Pierre-Sassoulas

I am fine with that. If you make a initial repository pylint-tensorflow even if minimal at first (manual pypi release + copy over setup.cfg settings) I'll move this pr there and help with initial setup files.

hmc-cs-mdrissi avatar May 09 '23 18:05 hmc-cs-mdrissi

So I created a repository (https://github.com/pylint-dev/pylint-tensorflow) added you as maintainer and took the namespace in pypi (https://pypi.org/project/pylint-tensorflow/). This is still very rudimentary but let's start small and add what we need when we need it (release not operational at the moment I need to add a pypi API key). There's a ton of thing that could be done better like using orgs label, and probably a million other thing, I'm thinking of using this to see what need to be done to make pylint plugin maintenance easier in the long term.

Pierre-Sassoulas avatar May 09 '23 19:05 Pierre-Sassoulas

Just a user of pylint and tensorflow here. I just ran into the problem with pylint throwing import-errors as reported. This discussion looks like you work on a solution for that. However, I also ran into the import-error problem for tensorflow-serving-api-package. Maybe the plugin also could incorporate the handle of those false positives as well. Would be great if there is a solution available somewhat covering the landscape of tf-packages.

However, thanks a lot for tackling this. This definitely will make it easier to write clean tensorflow code.

crzdg avatar May 10 '23 08:05 crzdg

We can now release in https://github.com/pylint-dev/pylint-tensorflow. I wonder if this MR should add an extra require of pylint-tensorflow that would register the proper transform automatically when the package is available. The issue of course is that it introduce a circular import. Maybe it's better if users have to install pylint-tensorflow on their own and pylint-tensorflow depends on astroid not vice versa. Or we could consider that astroid -> pylint-tensorflow -> astroid[tensorflow] is not a circular dependency.

Let's let pylint-tensorflow depend on pylint, that fixes the astroid dependency automatically and couples the plugin to what it refers to.

DanielNoord avatar May 10 '23 11:05 DanielNoord

Let's let pylint-tensorflow depend on pylint, that fixes the astroid dependency automatically and couples the plugin to what it refers to.

That would mean astroid can't be used for tensorflow parsing alone without a dependency to pylint (simply to get the tensorflow brain). I think astroid -> pylint-tensorflow -> astroid[tensorflow] or having to add pylint-tensorflow explicitly is more acceptable than that.

Pierre-Sassoulas avatar May 10 '23 14:05 Pierre-Sassoulas

Let's let pylint-tensorflow depend on pylint, that fixes the astroid dependency automatically and couples the plugin to what it refers to.

That would mean astroid can't be used for tensorflow parsing alone without a dependency to pylint (simply to get the tensorflow brain). I think astroid -> pylint-tensorflow -> astroid[tensorflow] or having to add pylint-tensorflow explicitly is more acceptable than that.

Why would you ever want only astroid and pylint-tensorflow? Shouldn't it be astroid-tensorflow then?

DanielNoord avatar May 10 '23 14:05 DanielNoord

Hmm, right, we might want to mix astroid-x and pylint-x for simplicity's sake.

Pierre-Sassoulas avatar May 10 '23 15:05 Pierre-Sassoulas

Hmm, right, we might want to mix astroid-x and pylint-x for simplicity's sake.

I think it is very likely that people will also start asking about ignoring specific messages for specific constructs after we have made such a package. I think one cookie cutter way of doing these plugins is probably best. I don't see a lot of value of using astroid on its own, especially in combination with a plugin, so I think it is fine to have all plugins be couple to pylint. If we ever see a need for astroid specific plugins we can fix that then.

DanielNoord avatar May 10 '23 15:05 DanielNoord

OK so this MR should be closed in favor of something in pylint-tensorflow, and user should know they have to install pylint-tensorflow (so probably needs a doc MR in pylint once there's actually something in pylint-tensorflow), right ?

Pierre-Sassoulas avatar May 10 '23 15:05 Pierre-Sassoulas

OK so this MR should be closed in favor of something in pylint-tensorflow, and user should know they have to install pylint-tensorflow (so probably needs a doc MR in pylint once there's actually something in pylint-tensorflow), right ?

Yes! We can keep a list of "supported" pylint plugins somewhere in our docs.

DanielNoord avatar May 10 '23 15:05 DanielNoord

I’m following discussion and will try to close and reopen the pr in new repo today. May take me a day or two though depending on how busy I am with work.

hmc-cs-mdrissi avatar May 10 '23 16:05 hmc-cs-mdrissi