pypowsybl
pypowsybl copied to clipboard
overcoming some python strict typing problems
-
Do you want to request a feature or report a bug? Neither a bug nor a feature for a functional request, typing issues.
-
Please tell us about your environment:
- Platform: Windows-10-10.0.19042-SP0
- Python: 3.9.10 (tags/v3.9.10:f2f3f53, Jan 17 2022, 15:14:21) [MSC v.1929 64 bit (AMD64)] on win32
- pypowsybl: 0.12.0
- Dev environment: VSCode 1.63.2, Pylance v2022.1.3
-
typing: I use and recommend the use of python strict type checking, for it prevents many bugs and adds autodocumentation. I faced some typing issues using pypowsybl. To overcome them I created a local
ElementType(Enum)
class and apn.Network
subclass. Hereafter they are and the explaination follows:
from enum import Enum, auto # https://docs.python.org/3/library/enum.html
import gzip # https://docs.python.org/3/library/gzip.html
from os.path import abspath # https://docs.python.org/3/library/os.path.html
from pathlib import Path # https://docs.python.org/3/library/pathlib.html
from typing import Union, Any # https://docs.python.org/3/library/typing.html
from pandas import DataFrame # https://pandas.pydata.org/docs/reference/frame.html
import _pypowsybl # (non recommended hack) https://www.powsybl.org/pages/documentation/grid/model/
import pypowsybl.network as pn # https://pypowsybl.readthedocs.io/en/stable/reference/network.html
class ElementType(Enum):
BATTERY = auto()
BUS = auto()
BUSBAR_SECTION = auto()
CURRENT_LIMITS = auto()
DANGLING_LINE = auto()
GENERATOR = auto()
HVDC_LINE = auto()
LCC_CONVERTER_STATION = auto()
LINE = auto()
# ... and all others ...
class Network(pn.Network):
def __init__(self, path: Path, parameters: Union[None, dict[str, Any]]=None) -> None:
"""
Loads a pypowsybl network from a file:
- path (str): the path of the file to load
- parameters (dict, optional): a map of parameters
NOTE 1: The format of the network description is discovered from the file extension (e.g. .xiidm)
NOTE 2: If the file is gzip-compressed (e.g. .xiidm.gz), the network is loaded without an intermediate decompressed file.
"""
print(f'Loading network from path({path}) ...')
if path.suffix.lower() == '.gz':
with gzip.open(abspath(path), 'rt', encoding='utf_8_sig') as text_io:
super().__init__(_pypowsybl.load_network_from_string(path.stem, text_io.read(), parameters or {})) # type: ignore
# '.gz' extension has to be removed, otherwise the network format isn't recognized
# and the following exception is raised:
# _pypowsybl.PyPowsyblError: Unsupported file format or invalid file.
else:
super().__init__(_pypowsybl.load_network(abspath(path), parameters or {})) # type: ignore
print(str(self))
return
def get_elements(self, element_type: ElementType) -> DataFrame:
"""
Overloads and calls the same method of the superclass pn.Network.
The difference is the use of the locally defined enumerated 'ElementType'.
"""
pn_element_type = next((x for x in pn.ElementType.__members__.values() if x.name == element_type.name)) # type: ignore
return super().get_elements(pn_element_type) # type: ignore
def elements_count(self) -> DataFrame:
"""
Returns a dataframe with the elements count of each ElementType.
"""
df = DataFrame({'count': [len(self.get_elements(x)) for x in ElementType.__members__.values()] },
index=list(ElementType.__members__))
df.index.name = 'ElementType'
return df
def main() -> None:
network: Network = Network(Path('20220101T0000Z_20220101T0000Z_pf.xiidm.gz'))
print(network.elements_count())
lines: DataFrame = network.get_elements(ElementType.LINE)
print(lines)
return
@main()
1/ Why ElementType(Enum)
?
pn.ElementType
or pn.ElementType.BUS
are both accessible respectively as <class '_pypowsybl.ElementType'>
and <ElementType.BUS: 0>
. However the definition of the latter is not accessible to the type checker and an error is reported when using any element type such as: df: DataFrame = network.get_elements(pn.ElementType.BUS)
2/ Why pn.Network
subclass?
To overload get_elements
using the ElementType(Enum)
without typing errors, whilst still using any other pypowsybl.network methods.
The new get_elements
wraps (once for all) the typing errors, hidden by # type: ignore
.
elements_count
is a basic example using get_elements
: it reports the size of the loaded network.
Note: get_elements
is NOT described in the documentation but it is a public method.
update_elements
also has a pn.ElementType
parameter and it is part of the documentation.
3/ About __init__
:
The method of the created subclass adds a feature: direct read of a .xiidm.gz
.
The method wraps and hides the typing errors from load_network_from_string
and load_network
.
The typing errors come from their signatures: parameters: dict = {}
.
dict
alone misses some information. It should probably be something like dict[str, Any]
.
Note that Using a dict
as a default parameter is not recommended: but this is beyond typing.
https://docs.quantifiedcode.com/python-anti-patterns/correctness/mutable_default_value_as_argument.html
pylint notifies this as an error in strict mode.
The recommended way is probably parameters: Union[None, dict[str, Any]]=None
(python 3.9) parameters: None | dict[str, Any] = None
(python 3.10)
4/ The file network.py
includes the class pn.Network
and also reads:
from typing import List as _List # (1)
from typing import Set as _Set # (2)
from pandas import DataFrame as _DataFrame # (3)
Use oftypint.List
(1) et typint.Set
(3) are deprecated since 3.9: they are now just list
and set
that nobody wants to make private (I am unsure if the new way to use them is available from python 3.7).
https://docs.python.org/3/library/typing.html#typing.List
https://docs.python.org/3/library/typing.html#typing.Set
From the typing point-of-view, the types of parameters and result of any public method MUST be public.
Otherwise the caller cannot type his variables, and a type checker in strict mode complains.
A signature of a public method like def my_method(self) -> _DataFrame:
is strange by essence, for it says that the result is a private type. However the good news is that the type checker understands:
from pandas import DataFrame
df: DataFrame = network.my_method()
since imported DataFrame == np._DataFrame
.
Therefore (3) only prevents from writing from np import DataFrame
, which would be anyway harmless.
Why do you make private systematically every imported object, such as : import datetime as _datetime
?
Thanks for the detailed analysis!
Just a quick answer to this part:
Why do you make private systematically every imported object, such as : import datetime as _datetime?
The purpose is to have clean namespaces: for example if we import datetime
in pypowsybl.network
, it will appear as being part of our own API, whereas it is not (you will never want to do from pypowsybl.network import datetime
, and you don't want your IDE to propose it with autocompletion, or it will be messy to use).
For now, we try to keep our namespaces tidy by importing other APIs with "private" scope. Maybe a more powerful solution, which seems broadly used, would be to move our implementations into inner modules, and build our public namespaces by importing in them only those pieces that we want to expose.
For example we could:
- Define
Network
inpypowsybl.network.core
with any import with any scope - In
pypowsybl.network.__init__
: importNetwork
class
This should enable us to keep namespaces tidy, while not making imports private.
The work in #306 and #307 should solve some of the typing problems you reported:
- we will now provide stubs for
_pypowsybl
- the integration process now includes a verification with
mypy
static checker, that all methods have type annotations, and that there is no typing problem. Note though, that it might behave slightly differently from thepyright
static type check, which I think is included in pylance. In particular, for now, arguments are implicitly considered optional when their default value is None. This may be improved in a further evolution.
Those improvements will be included in next release.
👍
Hi @PatrickBourdon
Did you have any opportunity to check if typing problems have indeed been solved in the last releases of pypowsybl, or if some remain ?
Fixed additional missing types for __members__
, following your mail feedback: #396