klayout
klayout copied to clipboard
WIP: Added python stubs with type hinting and documentation.
This is an attempt to help address #1092 and improve the API usability. These changes enable static type checking and hinting while using an IDE like VSCode with Pylance.
https://user-images.githubusercontent.com/1088659/180698692-dbfff118-55bb-479a-a036-d02a9c977746.mov
Details:
The documentation was extracted by inspecting the docstrings within each class and methods. This should enable type hinting and checking by IDEs like VSCode. The stubs were automatically generated, and have not been manually curated.
NOTE: This is a work in progress. I recommend manually checking nothing is broken and that no tests broke down.
See dev notes.
TODO:
- [x] 1. Complete signature generation for all methods in pyaModule.cc. A line containing func(arg1, arg2) should suffice. Multiple lines automatically generate overloading.
- [x] 2. Patch stubgen so that it can create docstrings. (Or wait for issue 11965 to be resolved)
- [x] 3. Generate stubs for klayout.*core.
- [ ] 4. Manually check and backannotate types in klayout.*core.pyi.
- [ ] 5. How to update the stubs if klayout code changes? Manually for now.
This looks great, thank you Thomas!
Many thanks for this PR!
Very good you noticed and fixed these default value issues.
About the automatic update I can add a qmake target. I'd prefer the (patched) stubgen script to be part of KLayout's scripts, so it does not need to be installed separately. However, the license of mypy is MIT which not compatible with GPL, so I guess there is a problem with that.
But there is another solution:
Actually, KLayout Python objects support reflection through a private (i.e. not publicly documented) interface in the "tl" module for pymod or "pya" module for the application.
For example, this script will print all classes and methods that are registered:
# Use "tl" for pymod instead of "pya"
import pya
for c in pya.Class.each_class():
base = "object"
if c.base():
base = c.base().name()
print(c.module() + "." + c.name() + " (" + base + "):")
for m in c.each_method():
print(" " + m.name())
The private classes involved are:
-
Class
: represents one registered class -
Method
: represents a method -
MethodOverload
represents a method variant -
ArgType
represents a method argument
You won't find these classes in the public documentation, but you'll see them using "help" (e.g. help(pya.Class)
).
The information provided by these objects should be good enough to emulate stubgen. I feel this is fairly simple and finally implements a straightforward data path instead of parsing doc strings. I think this could be easily added to the build script as a step between compiling the C code and collecting the files for the pymod installation.
I suggest you'll proceed for now with the stubgen approach and I can assist later in generating compatible information from the built-in reflection API.
Best regards,
Matthias
Thanks Matthias. I did not know about this functionality of the tl module. Live and learn, this software is so huge! There were a few more issues regarding default args I had identified but I followed the principle of minimal change.
Before I go further, I don't see the use of mypy is incompatible with klayout's GPL. I think it is compatible. But even if it weren't, we only need the output of the script, namely the .pyi
stub files (see the new files in the commit). So I think it's perfectly reasonable to install mypy only during the automation that creates the wheels. mypy shouldn't become a dependency in my opinion.
Secondly, I'm not sure we want to automate the generation of the stub files. I think that they are not 100% correct and can be improved upon manually, though I haven't gone through them in detail yet. For example, we can manually annotate the db.Trans.M0
etc. as an enumerate instead of a generic Any
. It would be nice to capture updates to the docstrings though.
Here's what I was able to draft from your example:
import pya # initialize all modules
import klayout.tl as ktl
for c in ktl.Class.each_class():
base = "object"
if c.base():
base = c.base().name()
if 'Path' not in c.name():
continue
print(c.module() + "." + c.name() + " (" + base + "):")
for m in c.each_method():
args = ["self"] + [f"{a.name()}: {a.to_s()}" for a in m.each_argument()]
print(" " + m.name() + "(" + ", ".join(args) +") -> " + m.ret_type().to_s())
This is the result:
db.Path (object):
new|#from_dpath(self, dpath: const DPath &) -> new Path *
to_dtype(self, dbu: double) -> DPath
transformed(self, t: const ICplxTrans &) -> Path
round_corners(self, radius: double, npoints: int) -> Path
new(self) -> new Path *
new|#new_pw(self, pts: const Point[] &, width: int) -> new Path *
new|#new_pwx(self, pts: const Point[] &, width: int, bgn_ext: int, end_ext: int) -> new Path *
new|#new_pwxr(self, pts: const Point[] &, width: int, bgn_ext: int, end_ext: int, round: bool) -> new Path *
<(self, p: const Path &) -> bool
\=\=(self, p: const Path &) -> bool
!\=(self, p: const Path &) -> bool
hash(self) -> unsigned long
points=(self, p: const Point[] &) -> void
each_point(self) -> Point
num_points|#points(self) -> unsigned long
width=(self, w: int) -> void
width(self) -> int
bgn_ext=(self, ext: int) -> void
bgn_ext(self) -> int
end_ext=(self, ext: int) -> void
end_ext(self) -> int
round=(self, round_ends_flag: bool) -> void
is_round?(self) -> bool
\*(self, f: double) -> Path
move(self, p: const Vector &) -> Path &
move(self, dx: int, dy: int) -> Path &
moved(self, p: const Vector &) -> Path
moved(self, dx: int, dy: int) -> Path
transformed(self, t: const Trans &) -> Path
transformed|#transformed_cplx(self, t: const CplxTrans &) -> DPath
from_s(self, s: string) -> new Path *
to_s(self) -> string
simple_polygon(self) -> SimplePolygon
polygon(self) -> Polygon
perimeter(self) -> unsigned long long
area(self) -> long long
length(self) -> unsigned int
bbox(self) -> Box
_unmanage(self) -> void
_manage(self) -> void
_create|#create(self) -> void
_destroy|#destroy(self) -> void
dup(self) -> new Path *
assign(self, other: const Path &) -> void
_destroyed?|#destroyed?(self) -> bool
_is_const_object?|#is_const_object?(self) -> bool
db.DPath (object):
new|#from_ipath(self, path: const Path &) -> new DPath *
to_itype(self, dbu: double) -> Path
round_corners(self, radius: double, npoints: int, accuracy: double) -> DPath
transformed(self, t: const VCplxTrans &) -> Path
new(self) -> new DPath *
new|#new_pw(self, pts: const DPoint[] &, width: double) -> new DPath *
new|#new_pwx(self, pts: const DPoint[] &, width: double, bgn_ext: double, end_ext: double) -> new DPath *
new|#new_pwxr(self, pts: const DPoint[] &, width: double, bgn_ext: double, end_ext: double, round: bool) -> new DPath *
<(self, p: const DPath &) -> bool
\=\=(self, p: const DPath &) -> bool
!\=(self, p: const DPath &) -> bool
hash(self) -> unsigned long
points=(self, p: const DPoint[] &) -> void
each_point(self) -> DPoint
num_points|#points(self) -> unsigned long
width=(self, w: double) -> void
width(self) -> double
bgn_ext=(self, ext: double) -> void
bgn_ext(self) -> double
end_ext=(self, ext: double) -> void
end_ext(self) -> double
round=(self, round_ends_flag: bool) -> void
is_round?(self) -> bool
\*(self, f: double) -> DPath
move(self, p: const DVector &) -> DPath &
move(self, dx: double, dy: double) -> DPath &
moved(self, p: const DVector &) -> DPath
moved(self, dx: double, dy: double) -> DPath
transformed(self, t: const DTrans &) -> DPath
transformed|#transformed_cplx(self, t: const DCplxTrans &) -> DPath
from_s(self, s: string) -> new DPath *
to_s(self) -> string
simple_polygon(self) -> DSimplePolygon
polygon(self) -> DPolygon
perimeter(self) -> double
area(self) -> double
length(self) -> double
bbox(self) -> DBox
_unmanage(self) -> void
_manage(self) -> void
_create|#create(self) -> void
_destroy|#destroy(self) -> void
dup(self) -> new DPath *
assign(self, other: const DPath &) -> void
_destroyed?|#destroyed?(self) -> bool
_is_const_object?|#is_const_object?(self) -> bool
Which is pretty nice. This method has the advantage of being completely agnostic to the docstring, it seems. Translating the argument types should be straightforward. One issue is that it fails to capture all the logic inside make_classes
which renames some method names and also deals with synonyms. So we'd have to duplicate the logic outside in python as well. What do you think?
Hi Thomas,
Very good, that is encouragingly simple.
Regarding the synonyms - I mainly use that to deprecate methods and to correct typos. Old versions still exist, but the are not shown in the help texts. For generation of the stubs I'd propose to simply iterate over the synonyms and produce a stub for each.
I have some more remarks or questions:
- Do the stubs need to know if a method is a property setter or getter? In this case there are flags which can be evaluated
- Some name translation happens during generation of Python bindings. For example "+" becomes "add" and "exec" becomes "exec_" (reserved word). "property=" becomes "set_property".
For details about name translation you can check this file: https://github.com/KLayout/klayout/blob/master/src/pya/pya/pyaModule.cc
Line 470++: list of reserved words (an underscore is appended) Line 511++: translation of names
Best regards,
Matthias
Hi Matthias,
I am aware of the pyaModule.cc
, but are you comfortable duplicating that code outside as well? It would duplicate the maintaining. Another idea: what if we create the stub file from pyaModule
itself instead of using the TL module? Then it would share the same methods.
Regarding your remarks:
- This is our choice I think. I can say what mypy does. If a property
name
is readonly, it annotates like so:
@property
f'def {name}(self) -> {type}: ...'
If it is read and write, then it annotates as a normal variable"
f'{name}: {type}'
I noticed that the community doesn't like variables that are 'write-only'. Are there variables like that? If there are, I just propose to pretend it is a RW variable for type hinting purposes. BTW, if it's a class variable, we need to do
f'{name}: ClassVar[{inferred}] = ...'
- That's fine. So long as the stub matches the pyaModule's translation. The good news is that there is a way to check the validity of stubs with mypy's
stubtest
, though I haven't tested yet.
LG Thomas
Hi Thomas,
Regarding write-only properties I am trying to avoid them as well, but maybe they slipped in. If they did, I'd suggest to turn them into methods like "set_property" and not make them "@property".
About code duplication: that is usually not a good idea but here it is about language specific details (which I guess won't change that quickly :) ) and sharing code across Python and C++ is notoriously difficult, so think accepting the duplication is just pragmatic in this case.
Best regards,
Matthias
Hi Matthias,
I made some progress in translating the types (See commit).
I have a question regarding properties. How do I use the tl.Class
mechanism to figure out whether a method is a property or not? And to tell whether it is a setter/getter? Do we just go by name? I wasn't able to figure that out.
Also, what is a method variant? How is it different than a synonym?
Thanks, Thomas
Hi Thomas,
this needs a little more explanation, I see.
First of all, I should say that I'd only provide stubs for the "primary method". Among the synonyms (MethodSynonym objects delivered by "each_overload") there is one matching the "primary_name" of the Method object. I'd ignore the other ones as they are usually deprecated versions.
Furthermore, I'd ignore signals as they do not have arguments and are neither setters nor getters.
Regarding the getter detection: there is a "is_getter" attribute in MethodSynonym, but that is rather a hint. KLayout's Python binding will generate getters also if there is a setter with the same name, the method accepts has no arguments and itself is not the setter. This boils down do an extended definition of "is_getter" derived like this:
def primary_synonym(m):
for ms in m.each_overload():
if ms.name() == m.primary_name():
return ms
raise("Primary synonym not found for method " + m.name())
... c is a class ...
setters = set()
for m in c.each_method():
if m.is_signal():
continue
method_def = primary_synonym(m)
if method_def.is_setter():
setters.add(method_def.name())
for m in c.each_method():
num_args = len([ True for a in m.each_argument() ])
method_def = primary_synonym(m)
# extended definition of "getter" for Python
is_getter = num_args == 0 and (method_def.is_getter() or (not method_def.is_setter() and method_def.name() in setters))
print(" " + method_def.name() + " setter=" + str(method_def.is_setter()) + " getter=" + str(is_getter))
Regarding the variants: many methods exist with the same name, but different argument types.
How do the stubs handle that? Can there be multiple stubs with the same name or is this kind of polymorphism not allowed in mypy in general?
Best regards,
Matthias
I haven't had a chance to test your suggestion, but I'd like to answer your question. If you analyze the dbcore.pyi
file from the PR, you'll see loads of "overloads", or as you call, variants. For example:
class Box:
...
@overload
def transformed(self, t: ICplxTrans) -> Box:
"""
transformed(t: ICplxTrans) -> Box
@brief Transforms the box with the given complex transformation
@param t The magnifying transformation to apply
@return The transformed box (in this case an integer coordinate box)
This method has been introduced in version 0.18.
"""
@overload
def transformed(self, t: Trans) -> Box:
"""
transformed(t: Trans) -> Box
@brief Returns the box transformed with the given simple transformation
@param t The transformation to apply
@return The transformed box
"""
@overload
def transformed(self, t: CplxTrans) -> DBox:
"""
transformed(t: CplxTrans) -> DBox
@brief Returns the box transformed with the given complex transformation
@param t The magnifying transformation to apply
@return The transformed box (a DBox now)
"""
Hi @klayoutmatthias ,
I have almost completed the new stubgen program based on your suggestion (check the commit above). It got a bit out of hand because I needed to inspect class variables inside other classes. For example: CompoundRegionOperationNode.LogicalOp
.
If you want to run it, run stubgen.py
inside src/pymod
.
There is a bug I'd like to ask you about. In the script you sent, the method_def.name()
in the case of the *=
operator in Point/DPoint/Vector/DVector
is actually returning =
. If you actually do m.name()
, I believe it returns *\\=
. I am assuming this is unintended behavior, but I didn't know how to fix it. Can you take a look?
Best, Thomas
Thanks a lot and also thanks informing me about the "*=" problem.
That is a kind of bug - *
in the method's annotated m.name()
means "protected", so *=
was read as "protected =
". Hence the backslashes and the *
being dropped.
I am currently preparing 0.27.11 and will include a small patch that fixes this problem.
I forgot about the child class traversal, I'm afraid. If you look closely at the implementation, you'll see that child classes are not only made available inside their parent class (e.g. "CompoundRegionOperationNode.LogicalOp"), but also on the top level as "CompoundRegionOperationNode_LogicalOp" (note the underscore). Please ignore this second version. It's a legacy concept from times when I did no understand how to generate child classes in Ruby or Python. I want to drop that in the future.
I'm also sorry about the messy code for the getter detection. That stems from times when Ruby was the only language and inside Ruby, getters are simply methods without arguments. So no need to identify getters. When I implemented Python, I should have first polished the is_getter attribute instead of hacking around with the information I had at hand. I want to change this in a way that is_getter reliably tells you the truth from the Python perspective. But that's a too big change to go into a stable 0.27 branch.
Best regards,
Matthias
No worries! Thanks for clarifying. I did not find a way to iterate child classes in the tl.Class object. So I decided to collect a set of "unknown" classes that appeared while iterating through method arguments or returns. This was my workaround. Having an Class.each_class
iterator would be the ideal solution.
Please let me know when you create a patch, so I can apply here and close the PR.
Hi @klayoutmatthias
Thanks for the patch on the *=
method. I ended up implementing two extra methods in tl.Class
: each_child_class
and parent
, which then helped me to complete the stubgen.py script. See results in dbcore.pyi
tlcore.pyi
and rdbcore.pyi
. I generated them based on version 0.27.11, but have not integrated with CI. I think they should be computed semi-manually for every release. Check Typehint Stubs Notes for some notes.
Note: you'll note that I took the liberty to change db.GenericDeviceExtractor.name
to db.GenericDeviceExtractor.set_name
, because the base class already had a name method (which is incompatible with a property), creating a bug in type checking.
I think it's ready for merging. It will be a great addition to 0.27.12.
Best, Thomas
Hi Thomas,
thanks a lot for this patch and the substantial bug fixes, specifically the embarrassing typos :( Doesn't shed a good light on test coverage :( I really appreciate your efforts!
Regarding the GenericDeviceExtractor: I'd prefer having a read/write attribute "name". Reason is that I already published code using "name=". I think it's only missing the getter, so let me add it and use "name=" again.
There is a small concern on mine regarding 0.27: usually I try not to touch much as my promise is to keep this branch stable. I think your code does not impose any risk here, but I'd like to give it a small review before I merge. Specifically the set_name on GenericDeviceExtractor I'd like to fix.
Best regards and many thanks,
Matthias
Hi Matthias,
I have cleaned up (reverted) the code in pyaModule.cc
that was left unused during refactoring to make your review easier.
thanks a lot for this patch and the substantial bug fixes, specifically the embarrassing typos :( Doesn't shed a good light on test coverage :( I really appreciate your efforts!
Very few typos, in my opinion. Just a few inconsistencies that are only relevant for the python (maybe ruby?) GSI.
Regarding the GenericDeviceExtractor: I'd prefer having a read/write attribute "name". Reason is that I already published code using "name=". I think it's only missing the getter, so let me add it and use "name=" again.
I agree. But if you do that, then I think it's best to change base class's DeviceExtractorBase.name
to a property.
There is a small concern on mine regarding 0.27: usually I try not to touch much as my promise is to keep this branch stable. I think your code does not impose any risk here, but I'd like to give it a small review before I merge. Specifically the set_name on GenericDeviceExtractor I'd like to fix.
Of course. I figured 0.27 is your stable branch and master is the unstable one which doesn't seem connected to 0.27. We don't seem to have a stable development branch.
Feel free to review and let me know if you want me to squash all commits to a single commit message so that we can reduce pollution in git history.
Hi Thomas,
For GenericDeviceExtractor the problem was simply that "name=" was in the wrong class. I moved it and now it should be consistent. I have placed one commit into your branch - I don't know how to submit a PR on a PR, so I took the liberty to do so :)
Apart from that I do not have any issues merging the branch. I'd like to do a squash commit so it's easier to cherry-pick for master in a single commit. Green light from your side?
Thanks,
Matthias
Hi Matthias,
For GenericDeviceExtractor the problem was simply that "name=" was in the wrong class. I moved it and now it should be consistent. I have placed one commit into your branch - I don't know how to submit a PR on a PR, so I took the liberty to do so :)
This was the intention! Thanks for letting me have branches in the main repo.
Apart from that I do not have any issues merging the branch. I'd like to do a squash commit so it's easier to cherry-pick for master in a single commit. Green light from your side?
I just tested your changes and recompiled the .pyi
files. OK to merge. I recommend using the "Squash and merge" button so that I'm counted as a contributor.
data:image/s3,"s3://crabby-images/cbf72/cbf72efa029b2a78299d82b63fd7f1b059c0fe30" alt="image"
Best, Thomas
I just merged. Sorry for the delay - I was busy last week.
I'll try to merge into master and close the ticket then.
Again, thanks for your efforts!
Matthias
Thanks for merging! I believe this closes #1092 .