gdal
gdal copied to clipboard
WIP: Run mypy on CI to check python type hints
This is exploring adding mypy to the CI so we can check generated python type hints (see #3538).
What does this PR do?
Running mypy on the generated osgeo directory of the python bindings currently yields ~900 warnings. A lot of them seem to be because swig generates annotations containing C/C++ types like:
def DecToPackedDMS(*args) -> "double":
Mypy won't be happy about that annotation though and will generate an error like:
osgeo/gdal.py:5940: error: Name "double" is not defined
The strategy explored in this PR is to add a bunch of Python TypeAlias to the bindings so that mypy is aware that a double is just a python float:
%pythoncode %{
from typing import TypeAlias
void: TypeAlias = None
double: TypeAlias = float
%}
What are related issues/pull requests?
#3538
On SWIG side, there is ongoing discussion on how to allow custom type hints (rather than C types one), but as far as I can tell, this isn't implemented yet: https://github.com/swig/swig/issues/735
Open question
Once we have mypy running on CI, is it better to land this already and then do multiple PR to get rid of some of the errors one by one ? Or would you prefer one big PR directly fixing all the type hints ? I guess the "big PR" approach could be hairy if some of the type hints require a bit more work than just mere aliases.
Tasklist
- [ ] Get initial mypy setup & running on CI
- [ ] Get maintainer decision on whether this is better done in a single PR or multiple small ones
One could argue that for the base types (double, etc.), this should be done by SWIG itself, and thus it would be better to fix that at that level (limiting the manual work to custom typemaps), but I've no idea how hard it is to fix that in SWIG itself
Regarding the incremental vs big PR approach, I've no idea of the amount of changes, so up to you and your intutition on what approach works best.
Thanks for the quick reply !
One could argue that for the base types (double, etc.), this should be done by SWIG itself, and thus it would be better to fix that at that level (limiting the manual work to custom typemaps), but I've no idea how hard it is to fix that in SWIG itself
Agreed - I can explore this in parallel and get in touch with the SWIG project to see if something can be done there.
Regarding the incremental vs big PR approach, I've no idea of the amount of changes, so up to you and your intutition on what approach works best.
Then I think I'll start fixing some warnings and split in multiple PR if I see it gets out of hand.
@julienr IMO one big PR should be fine if:
- each commit fixes one logical change (ie: one commit for
doubleeverywhere, another commit forvoideverywhere, etc) so it can be reviewed by-commit. - anything non-obvious/tricky/complex or that requires a description has it's own commit
- bugs you find along the way have their own commits
- you're happy to rebase changes during reviews to keep the logical commits sensible.
But likewise splitting it up into multiple PRs (easy -> hard?) would work too 😄
After playing with the problem a bit more, it may be indeed a good idea to try to improve SWIG to generate proper type hints. On some functions that are using multiple typemaps, it can generate quite confusing type hints between what's in the docstring and the inline hints:
def GetWellKnownGeogCSAsWKT(*args) -> "char **":
r"""GetWellKnownGeogCSAsWKT(char const * name) -> OGRErr"""
return _osr.GetWellKnownGeogCSAsWKT(*args)
I think what's happening here is that this function can return either an OGRErr (on failure) or a string, so the proper type would be OGRErr | str or something like this.
I don't think that can all be solved with aliases, so it's either fixing SWIG or writing some scripts to parse & replace the generated python files - which sound brittle.
closing that one as staled. Please reopen if progress is made
@rouault I'm glad work on this has stopped. The Python bindings make it more efficient to write tests for GDAL, for sure, but I don't think we should otherwise be trying to make them super modern and fancy and enterprisey.