numpydoc
numpydoc copied to clipboard
ENH: merge duplicate sections (rather than raise)
This allows the docs for matplotlib 1.5.x to continue building.
This is a follow on to https://github.com/numpy/numpydoc/pull/65
The ability to merge sections enables decorators to easily add sections to the docstring without having to pay the cost (at import time) of parsing the docstring and re-writing it.
attn @NelleV (who I know disagrees).
FYI, I am working on a patch to remove the docstring appending on matplotlib.
As @tacaswell I think this is a bad idea (apart from the fact it does allow our doc to build right now). Automatic appending of elements create docstring that are hard to read in a terminal and hard to maintain. Having two sections should never happen elsewiise.
More exactly, I think this patches makes matplotlib developers life much easier, but it a really bad patch for all of the other projects that use numpydoc as it allows a flexibility that should not be tolerated. The bare minimum would be to raise a warning (which would also break matplotlib's documentation, as warnings are not allowed).
I do not understand the hostility to this flexibility. Non-trivial decorators may add functionality which needs to be documented in the docstring of the final function, and injecting it via the decorator makes a good deal of sense.
I am still not convinced that mpl is going to drop docstring appending / parameterization.
Let's not confuse the two issues too much. There are use cases for decorators in docstrings, but the merits of those are orthogonal to this PR. There are no good reasons to have duplicate sections in docstrings (even if decorators are used), but if MPL has those now we can consider to postpone raising an exception.
This is only an issue only with numpydoc master right, after gh-65? And since there's no release of numpydoc yet with gh-65, there's no issue yet right now?
Can MPL just use numpydoc==0.6.0
in the 1.5.x maintenance branch, and fix things in master?
Even with 0.6.0 it is doing something funny in it is only keep the last version of a section in the docstring so it is still broken.
The reason there needs to be two Notes
section is that (without parsing the docstring) the decorator can not know what the last section in the wrapped functions docstring is. The only safe way to tack more text onto the docstring is to put it under a Notes
section (for example, if the last section is Returns
a paragraph of text causes parsing issues). If there is already a Notes
section it in v0.6.0 over-writes in when rendered (which is not great).
As near as I can tell the only way to allow decorators to append to the the docstring safely is to merge duplicates like this.
Why can't the decorator look for a Notes
section itself, append if there is one, and create one if there isn't (i.e. put the merge in a decorator instead of in numpydoc)? Can you point me to an example?
A decorator anyway can't just blindly append, even if numpydoc does the merge. Sections will be in the wrong order, and the docstring will be incorrectly ordered in a terminal.
https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/init.py#L1616
The decorator is adding a kwarg to the decorated function. We do not want to pay the import time cost to parse the docstrings (this happen on basically every plotting method).
I am not super concerned about duplicate / misordered sections in the terminal (in fact it was not obvious to me that there was a defined order after the summary).
The decorator is adding a kwarg to the decorated function. We do not want to pay the import time cost to parse the docstrings (this happen on basically every plotting method).
That use case makes sense. Adding the _DATA_DOC_APPENDIX
will basically put the info that should be in the Parameters
section in the Notes
section instead (at best, if we merge duplicate sections). So with your current decorator you'd get a better result when putting the two lines
data : indexable instance, optional
Replaces the keywords that follow it by ``data[<arg>]``.
(in fact it was not obvious to me that there was a defined order after the summary).
There is, it's the ordering specified in https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt#id5
Long term I think numpydoc
should raise on duplicate sections, this is in almost every case a clear bug. Issuing a warning in the meantime I don't really care about either way, because with the hundreds of warnings that Sphinx typically spits out those are really hard to spot anyway.
Parsing a docstring can be done very quickly if all you are interested in is finding the notes section. Is the delay of locating the notes section really enough to make this unworkable?
From the numpydoc perspective, duplicate sections should raise errors.
I agree it would be better to inject that into the parameters section, however not all of mpl's docs are in proper numpydoc format yet so we can not assume that we have a Parameters
section or an existing Notes
section (which is why we settled on just tacking it onto the end).
It is not clear to me (and this may be my reading comprehension at fault) that the numbering in HOWTO_DOCUMENT
is prescriptive, rather than just enumerated by happenstance. That should probably get its own issue?
The are prescriptive, because that is the standard. Would you have preferred them be more flexible?
I have been reading it as flexible for a few years now, this conversation really is the first time I even considered the order of the sections mattering.
It mostly needs some extra words indicating that it is meant to be prescriptive, that is all.
I don't think numpydoc even builds if the order isn't preserved. At least, it used to be this way.
Anyhow, numpydoc should have a much proper documentation (cf #66) @stefanv We should put some students on this.
Thanks for clarifying that point.
https://github.com/numpy/numpydoc/issues/68
The SciPy docs also fail to build with current master, there's at least one docstring with two Notes sections. I think we should merge this after the addition to raise a warning on doing a section merge.
Would it help to have this explicit, where the library can from numpydoc.docscrape import merge_docstrings
and then merge_docstrings(base_doc, additional_doc)
?
I think this would be better than a warning. Also, importing numpydoc.docscrape no longer has any sphinx dependencies, or dependencies beyond standard library...
Did the rebase (which mostly reverts #95).
@jnothman what would you expect the return value of merge_docstrings
to be?
A docstring (str). We've already got NumpyDocString -> str implemented, so why not?
On 3 November 2017 at 01:58, Thomas A Caswell [email protected] wrote:
Did the rebase (which mostly reverts #95 https://github.com/numpy/numpydoc/pull/95).
@jnothman https://github.com/jnothman what would you expect the return value of merge_docstrings to be?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/numpy/numpydoc/pull/67#issuecomment-341448525, or mute the thread https://github.com/notifications/unsubscribe-auth/AAEz6wjIEt3EcB6-GMhTyiKXE9cvVoGZks5sydiDgaJpZM4KYB4d .
I would like to chime in saying that I think the ability to merge docstrings would be really cool.
I’m working on a few decorators that can make it easier to transition functions to new behaviors.
The ability to add to the docstring is quite an important feature as it would avoid writing even more boiler plate warning messages.
The docstring appending would happen at import time and not run time so I don’t think it is a big issue.
I was thinking of using the numpydoc structure, but I think making numpydoc a hard dependency of the “deprecation-factory” would be a mistake. Adding a badly formatted string (appending it) is better than adding a hard dependency.
Edit: finished my sentence