python-fints icon indicating copy to clipboard operation
python-fints copied to clipboard

Replace enum documentation hack with `enum-tools` decorator

Open lukasjuhrich opened this issue 3 years ago • 12 comments

tl;dr shave off 2s from import time and thus from every one of my pytest runs in a dependent project :-)

When the FINTS_DISABLE_DOC_HACK environment variable is present, fints will no longer do source line introspection at import time.

I do realize it's hard to add documentation to enums in a way that both appears in sphinx and in an interactive console's help. However, the drawback is an absolutely insane performance penalty:

pycroft@7f9d0d03ed66:~/app$ time python -c 'from fints import dialog'

real    0m3.179s
user    0m3.098s
sys     0m0.050s
pycroft@7f9d0d03ed66:~/app$ export FINTS_DISABLE_DOC_HACK=1
pycroft@7f9d0d03ed66:~/app$ time python -c 'from fints import dialog'

real    0m0.936s
user    0m0.901s
sys     0m0.023s
pycroft@7f9d0d03ed66:~/app$

I chose an opt-out mechanism in order to be as unobtrusive to current setups as possible.

lukasjuhrich avatar Jan 10 '22 01:01 lukasjuhrich

based on reading stuff like https://github.com/lukasjuhrich/python-fints/blob/2b264f1a3119e9544cfdd3edcbc6a48a9b7a3ba9/fints/formals.py#L583

i wonder if https://docs.python.org/3/library/enum.html#when-to-use-new-vs-init might be a better long term solution by including the documentation as string value at declaration time

RonnyPfannschmidt avatar Jan 23 '22 15:01 RonnyPfannschmidt

Polite ping to @raphaelm. Is there any blocker?

lukasjuhrich avatar Jun 12 '22 00:06 lukasjuhrich

Sorry, yeah. I wasn't aware of this problem before your PR, so thanks for the PR! I'm not sure on the solution, though, an undocumented environment opt-out doesn't really feel elegant. I was hoping to find a better solution to the root cause but there still seems no good solution for enum documentation.

However I don't think many people use .doc a lot, so I'd be fine with an opt-in that is documented and activated in the sphinx build, I guess?

@henryk Any thoughts, as this was your idea originally?

raphaelm avatar Jun 12 '22 09:06 raphaelm

Looking at that horrible hack I wrote in 2018, I think Ronny is right :) There appears to be a different, not as hackish, hack available: https://stackoverflow.com/questions/50473951/how-can-i-attach-documentation-to-members-of-a-python-enum/50473952#50473952 This would be easy to implement: Change RepresentableEnum to work like DocEnum from the stackoverflow answer, and replace all #: ... in affected enums with , "...". I'm not sure how Sphinx will react, but if it doesn't work there appears to be an extension specifically for this: https://enum-tools.readthedocs.io/en/latest/api/autoenum.html (which is where I found the stackoverflow link).

henryk avatar Jun 12 '22 18:06 henryk

Nice! @lukasjuhrich would you happen to be interested to prepare that change as well?

raphaelm avatar Jun 12 '22 18:06 raphaelm

I've just tested the proposed __new__ customization. However, sphinx autoclass mechanism does not seem to care in the slightest about __doc__ and exclusively looks at the syntax tree (i.e. a subsequent comment).

So to me it seems that the requirements

  1. Having working sphinx documentation with autodoc
  2. Having working __doc__ for use in an interactive console or with pydoc

Are mutually exclusive unless we add tooling to sphinx.

I'll look at what autoenum could do for us.

lukasjuhrich avatar Jun 13 '22 17:06 lukasjuhrich

autoenum just seems to provide an additional directive and doesn't directly modify what is initiated by automodule in the doc definitions: https://github.com/raphaelm/python-fints/blob/9e80e6bab933ec2651c34526caec360a0e000d56/docs/developer/types.rst?plain=1#L24-L31

So before I dig in any deeper, wouldn't it be a reasonable solution to just drop the item-specific __doc__ contents? Yes, it would make these docstrings invisible to pydoc, but I wouldn't say that it's too much of a hassle to open the readthedocs site.

Diving deep into sphinx customization for that little bit of functionality isn't really something I'd be up for (plus I've tried to deal with sphinx/docuitils before and it was not fun).

lukasjuhrich avatar Jun 13 '22 17:06 lukasjuhrich

I think I'd be fine with dropping __doc__ support for the enums if we still have the meaning visible in both sphinx and source code.

raphaelm avatar Jun 13 '22 18:06 raphaelm

I totally forgot I didn't finish the work here.

Good news № 1

Using @enum_tools.documentation.document_enum on our enums and replacing #: member_comment with # doc: member_comment results in pretty much equivalent doc renderings.

Before (with hack)

(src) image

After (with enum-tools)

image

Good news № 2

It is important to understand that the enum-tools.documentation.document_enum decorator essentially does the same thing as this hack—read the source lines and use the ast module to parse it: https://github.com/domdfcoding/enum_tools/blob/6a9ed9d1eb0b29eca04930692a4989aa6bd8eedb/enum_tools/documentation.py#L234-L235

…however, for some reason enum_tools seems to be much more efficient in doing so (I did not quite investigate it further, though).

Performance comparison of imports

running pyinstrument on from fints.formals import * gave the following results:

Version with `RepresentableEnum`
2.031 <module>  <string>:1
   [4 frames hidden]  <string>, runpy
      2.030 _run_code  runpy.py:63
      └─ 2.030 <module>  test.py:1
         └─ 2.006 <module>  fints/formals.py:1
            ├─ 1.811 __new__  enum.py:180
            │     [9 frames hidden]  enum, <built-in>
            │        1.805 __init__  fints/utils.py:274
            │        └─ 1.782 getsourcelines  inspect.py:1120
            │              [244 frames hidden]  inspect, ast, <built-in>, genericpath...
            │                 0.906 compile  <built-in>:0
            └─ 0.181 <module>  fints/fields.py:1
               └─ 0.170 <module>  fints/types.py:1
                  └─ 0.166 <module>  fints/utils.py:1
                     └─ 0.161 <module>  mt940/__init__.py:1
                           [163 frames hidden]  mt940, urllib, http, email, socket, e...
Version with `enum-tools`
0.598 <module>  <string>:1
   [11 frames hidden]  <string>, runpy, pkgutil, <frozen zip...
      0.584 _run_code  runpy.py:63
      └─ 0.584 <module>  test.py:1
         ├─ 0.532 <module>  fints/formals.py:1
         │  ├─ 0.274 <module>  fints/fields.py:1
         │  │  └─ 0.261 <module>  fints/types.py:1
         │  │     ├─ 0.233 <module>  fints/utils.py:1
         │  │     │  ├─ 0.205 <module>  mt940/__init__.py:1
         │  │     │  │     [230 frames hidden]  mt940, urllib, http, email, socket, e...
         │  │     │  └─ 0.024 BufferedReader.read  <built-in>:0
         │  │     │        [2 frames hidden]  <built-in>
         │  │     └─ 0.025 BufferedReader.read  <built-in>:0
         │  │           [2 frames hidden]  <built-in>
         │  ├─ 0.236 <module>  enum_tools/__init__.py:1
         │  │     [267 frames hidden]  enum_tools, pygments, re, sre_compile...
         │  └─ 0.009 BufferedReader.read  <built-in>:0
         │        [2 frames hidden]  <built-in>
         ├─ 0.025 BufferedReader.read  <built-in>:0
         │     [2 frames hidden]  <built-in>
         ├─ 0.015 open  <built-in>:0
         │     [2 frames hidden]  <built-in>
         └─ 0.008 compile  <built-in>:0
               [2 frames hidden]  <built-in>

lukasjuhrich avatar Oct 18 '22 18:10 lukasjuhrich

For transparency, I've force-pushed my master branch to incorporate the changes using @enum_tools.documentation.document_enum instead of the environment variable. The original change of this PR was https://github.com/lukasjuhrich/python-fints/commit/d2bf8b175bbf77d12b2f5e6e3d98ad02a9a7a950.

Fixing the tests now.

lukasjuhrich avatar Oct 19 '22 11:10 lukasjuhrich

Codecov Report

Merging #136 (6ba49bd) into master (9e80e6b) will decrease coverage by 0.21%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #136      +/-   ##
==========================================
- Coverage   87.78%   87.56%   -0.22%     
==========================================
  Files          26       23       -3     
  Lines        3332     3161     -171     
==========================================
- Hits         2925     2768     -157     
+ Misses        407      393      -14     
Impacted Files Coverage Δ
fints/utils.py 60.50% <ø> (-1.54%) :arrow_down:
fints/formals.py 98.61% <100.00%> (+0.08%) :arrow_up:
fints/fields.py 84.33% <0.00%> (-1.85%) :arrow_down:
fints/__init__.py
fints/message.py
fints/dialog.py

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

codecov[bot] avatar Oct 19 '22 11:10 codecov[bot]

@raphaelm

  • coverage reduction in fints.utils probably due to above-average coverage of deleted lines, causing the aggregate file coverage to decrease.
  • coverage reduction in fints.fields is because the if addendum and […]-branch is not passed anymore. I'm not sure why though, because the document_enum decorator actually takes care of the __doc__.

I don't quite understand what's happening here and whether the code branches are actually still used.

https://github.com/raphaelm/python-fints/blob/9e80e6bab933ec2651c34526caec360a0e000d56/fints/fields.py#L229-L235

Tracing function usages make it look like the docstrings are somehow used for various print_nested() functions. I'm not quite sure what those do or why they need the docstrings of the enum values.

Even more weirdly, debugging this yielded the following observation: An assert IdentifiedRole.MS.__doc__ == "Message Sender" works in a python shell but not in a pytest test.

lukasjuhrich avatar Oct 19 '22 17:10 lukasjuhrich

I want to make y'all aware of the insane number of packages we'd need to add to Nixpkgs (and other distros) in order to be able to simply build enum-tools from source: https://github.com/NixOS/nixpkgs/pull/299901 They seem to all be created by a single author and, honestly, I have no clue why the build tool whey needs to have this many dependencies. Please reconsider using enum-tools.

dotlambda avatar Apr 10 '24 02:04 dotlambda

Interesting. I'm not keen on reverting to the very slow approach, but since the enum-tools is basically only needed for generating the documentation (and for using __doc__ at runtime, technically), I could also think of making it an optional dependency in some way if you can do without the docs for nixos?

raphaelm avatar Apr 10 '24 07:04 raphaelm

Yes, that would work for us.

dotlambda avatar Apr 10 '24 13:04 dotlambda

See https://github.com/raphaelm/python-fints/pull/163

raphaelm avatar Apr 10 '24 14:04 raphaelm