numpydoc
numpydoc copied to clipboard
improve section detection
Potentially fixes #316.
This tries to allow parsing sections which are not separated by blank lines (there should probably be a warning in that case, I'll add that once the general idea has been approved). In order to get that to work used a few tricks (e.g. add a optional doc
parameter to _is_at_section
to allow calling it on a different reader) so it might need some refactoring before being truly ready.
cc @Carreau, my main motivation was trying to get velin
to auto-fix this
yielding StopIteration
seems like a bug: https://github.com/numpy/numpydoc/blob/265ab91fca60192bbc56217d4bae958e326b9b1b/numpydoc/docscrape.py#L214-L219
because it will cause this to fail with a obscure error: https://github.com/numpy/numpydoc/blob/265ab91fca60192bbc56217d4bae958e326b9b1b/numpydoc/docscrape.py#L380-L381
I think this should be fixed, either by changing the yield value to yield StopIteration, StopIteration
or yield "", []
, or by raising an error instead (if that's what that was supposed to indicate).
Ping. Does anyone have any comments on this?
Ping. Does anyone have any comments on this?
I think the discussion in gh-316 shows this is not desirable. Detecting in order to raise a better warning or even an exception makes sense though.
apologies for pinging and then forgetting about it for a year.
I implemented the requested change, such that it will now warn on every missing empty line (between summary and the first section, or between two sections).
The current implementation comes with a (slight?) performance regression because NumpyDocString._is_at_section
is now called for every line, and I had to do a slightly ugly trick to make sure _is_at_section
doesn't swallow empty lines.
Instead, I could also imagine doing a two-pass implementation: find all separators (multiple -
or =
per line) and check if those belong to a section which is not preceded by an empty line in the first pass, then run the actual extraction in the second pass. That might make the code a bit simpler and potentially faster, but every line would be visited twice (I'm not sure how much of an issue that would be).
Edit: I think the CI failures are unrelated (not sure, though)