
Docs spiff up

Open leotrs opened this issue 3 years ago • 1 comments

  • Added a short blurb on the landing page. This was feedback I received from someone who looked at the page and told me they had no idea what the library was even for!
  • Added a reference that seemed relevant.
  • Updated python version to 3.11.
  • Fixes #185.

leotrs avatar Oct 28 '22 14:10 leotrs

@maximelucas This PR addresses #185. Could you please confirm that the proposed solution actually solves the issue to your satisfaction?

The old behavior was:

>>> H.degree?
Signature: H.degree(node=None, *args, **kwargs)
Docstring: <no docstring>
File:      ~/code/xgi/xgi/classes/hypergraph.py
Type:      function

The new behavior is:

>>> H.degree?
Signature: H.degree(node=None, *args, **kwargs)
Docstring:
Equivalent to H.nodes.degree.asdict(). For accepted *args and
**kwargs, see documentation of H.nodes.degree.
File:      ~/code/xgi/xgi/classes/hypergraph.py
Type:      function

>>> H.size?
Signature: H.size(node=None, *args, **kwargs)
Docstring:
Equivalent to H.edges.size.asdict(). For accepted *args and
**kwargs, see documentation of H.edges.size.
File:      ~/code/xgi/xgi/classes/hypergraph.py
Type:      function
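
For readers following along, here is a minimal sketch of one way such shortcut docstrings could be generated from a template. ToyHypergraph and add_stat_shortcut are hypothetical names used only for illustration; this is not necessarily how the PR implements it.

```python
class ToyHypergraph:
    """Hypothetical stand-in for xgi.Hypergraph; not the library's code."""


def add_stat_shortcut(cls, view_name, stat_name):
    """Attach cls.<stat_name> with a docstring that points to the view-level stat."""

    def shortcut(self, *args, **kwargs):
        # Forward to the stat on the view, e.g. self.nodes.degree(...).asdict().
        # The toy class defines no views, so this body is never called below.
        stat = getattr(getattr(self, view_name), stat_name)
        return stat(*args, **kwargs).asdict()

    shortcut.__name__ = stat_name
    # Docstrings are ordinary attributes, so they can be filled in from a template.
    shortcut.__doc__ = (
        f"Equivalent to H.{view_name}.{stat_name}.asdict(). For accepted *args and\n"
        f"**kwargs, see documentation of H.{view_name}.{stat_name}."
    )
    setattr(cls, stat_name, shortcut)


add_stat_shortcut(ToyHypergraph, "nodes", "degree")
add_stat_shortcut(ToyHypergraph, "edges", "size")
print(ToyHypergraph.degree.__doc__)
```

The point of the template is that every shortcut gets a consistent pointer to the view-level stat without anyone writing each docstring by hand.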

leotrs avatar Oct 28 '22 14:10 leotrs

I like it, thanks Leo. I'm now thinking we might need to go further, because if a new user starts looking for H.nodes.degree they will have a hard time finding it in the docs:

  • if they look for it in classes, Hypergraph.py doesn't say anything about stats and has no pointers: https://xgi.readthedocs.io/en/latest/api/classes/xgi.classes.hypergraph.Hypergraph.html#xgi.classes.hypergraph.Hypergraph.edges
  • if they know to look in the stats package (how would they?), they might end up here: https://xgi.readthedocs.io/en/latest/api/stats.html, which clearly contains more info. But the new user has to make a conceptual jump and understand that information about H.nodes.degree is actually described under xgi.stats.nodestats.degree(H), which looks pretty different. And there is no example that makes that particular link explicit (there are some examples elsewhere on the page, further up).

I'm not sure how yet, but I'm realising we might need to make the link between these things much more explicit for new users. So far, the best "explanation" of how stats work and what they are is probably the tutorial.

maximelucas avatar Oct 31 '22 12:10 maximelucas

Okay I realise now that the user can get some info from

>>> H.nodes.degree?
Signature:   H.nodes.degree(*args, **kwargs)
Type:        NodeStat
String form: NodeStat('degree')
File:        ~/WORK/SCIENCE/xgi/xgi/stats/__init__.py
Docstring:  
An arbitrary node-quantity mapping.

`NodeStat` objects represent a mapping that assigns a value to each node in a
network.  For more details, see the `tutorial
<https://github.com/ComplexGroupInteractions/xgi/blob/main/tutorials/Tutorial%206%20-%20Statistics.ipynb>`_.

The docstring describes NodeStat (rather than the degree stat) and points to the tutorial. Maybe this is already enough for this PR. I do feel like we need to make this better in the docs too, even if in another PR.

maximelucas avatar Oct 31 '22 13:10 maximelucas

Oh that's a good point. The docstring of xgi.stats.nodestats.degree is never really being exposed to the user. The user is not even meant to know that degree lives in xgi.stats.nodestats.

We could try to merge the two docstrings somehow but I'm not sure what's the best/most efficient way of doing this. In particular, a big problem is that H.nodes(node_subset).degree(*args, **kwargs).asdict() is equivalent to xgi.stats.nodestats.degree(H, node_subset, *args, **kwargs). So the docstring for the latter contains documentation for parameters H and node_subset which are not part of H.nodes.degree. So we can't just copy-paste the docstrings. We'll have to change something more fundamental I think (or write docstrings twice...).
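
To make the parameter mismatch concrete, here is a toy sketch. The degree function below is a hypothetical stand-in written for this example (not xgi's implementation), the network is just a list of sets, and "order" is assumed to mean edge size minus one. Binding the first two arguments with functools.partial plays the role of the view, and inspect.signature shows which parameters remain visible to the user.

```python
import functools
import inspect


def degree(net, bunch, order=None):
    """Toy stat: for each node in bunch, count the edges of net that contain it."""
    return {
        n: sum(1 for e in net if n in e and (order is None or len(e) == order + 1))
        for n in bunch
    }


edges = [{1, 2, 3}, {2, 3}, {3}]  # toy network: a list of edges
node_subset = [2, 3]

# The view-style call supplies the first two arguments behind the scenes ...
bound = functools.partial(degree, edges, node_subset)

# ... so only the remaining parameters are part of the user-facing call.
all_params = list(inspect.signature(degree).parameters)
user_params = list(inspect.signature(bound).parameters)

print(all_params)   # parameters documented in the module-level docstring
print(user_params)  # parameters the user actually passes
print(bound(order=1) == degree(edges, node_subset, order=1))
```

A docstring written for the module-level function documents every entry of all_params, while the user-facing call only accepts those in user_params, which is why a plain copy-paste would mislead.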

leotrs avatar Oct 31 '22 13:10 leotrs

How is the equivalence between H.nodes().degree().asdict() and xgi.stats.nodestats.degree(H) implemented in the code?

I think:

  1. the docs of H.nodes and H.edges should mention the stats and point to the relevant documentation (maybe the stats package?).
  2. the stats package documentation page should explicitly mention the equivalence.
  3. the docs of H.degree (and other stats) should at least point to the docs of H.nodes.degree (which it now does)
  4. but then the docs of H.nodes.degree should be explicit, mention the equivalence with the stats package syntax, and be findable also online.

If there is a way to solve the last point without writing the docstrings twice, great. If not, I suggest we write it twice (at least for now).

maximelucas avatar Oct 31 '22 15:10 maximelucas

How is the equivalence between H.nodes().degree().asdict() and xgi.stats.nodestats.degree(H) implemented in the code?

The short answer is that each NodeStat object is created with a reference to one of the functions in xgi.stats.nodestats. For example, the H.nodes.degree object contains a reference to xgi.stats.nodestats.degree.

The long answer is, well, long. When a user executes H.nodes.degree for the first time a number of things happen:

  1. First, H.nodes tries to search for a degree method, but doesn't find one. So H.nodes.__getattr__ is executed.
  2. This just calls H.nodes._dispatcher.__getattr__, which actually searches for a function called degree inside the xgi.stats.nodestats module.
  3. Once the degree function is found, a NodeStat object is built that wraps around it (this is in line 83 of xgi.stats.__init__.py). In particular, this NodeStat object stores the degree function as self._func.
  4. The dispatcher object H.nodes._dispatcher will then remember this NodeStat object, so the next time the user accesses H.nodes.degree, the same object is returned and the search doesn't happen again.
  5. So from now on, typing H.nodes.degree just returns a NodeStat object.
  6. Doing H.nodes.degree.asdict() eventually accesses H.nodes.degree._val, which in turn calls self._func(self.net, self.view.ids, *self.args, **self.kwargs), where self.net is the same as H.
  7. The same happens with any other call such as H.nodes.degree.aslist() or H.nodes.degree.asnumpy(), etc. They all just access self._val, which calls self._func.
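
The steps above can be sketched with toy classes. Everything here (toy_nodestats, ToyNodeStat, ToyDispatcher, ToyNodeView) is a hypothetical stand-in that mirrors the description, not the actual xgi source; the network is a plain list of sets.

```python
import types

# Hypothetical stand-in for the xgi.stats.nodestats module.
toy_nodestats = types.SimpleNamespace(
    degree=lambda net, bunch: {n: sum(n in e for e in net) for n in bunch}
)


class ToyNodeStat:
    """Wraps a stat function together with the network and node ids."""

    def __init__(self, net, ids, func):
        self.net, self.ids, self._func = net, ids, func
        self.args, self.kwargs = (), {}

    @property
    def _val(self):
        # Steps 6-7: every output format goes through this single call.
        return self._func(self.net, self.ids, *self.args, **self.kwargs)

    def asdict(self):
        return dict(self._val)

    def aslist(self):
        return list(self._val.values())


class ToyDispatcher:
    """Looks up stat functions by name and remembers the wrappers it built."""

    def __init__(self, net, ids, module):
        self.net, self.ids, self.module = net, ids, module
        self._cache = {}

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails (steps 2-4).
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._cache:
            func = getattr(self.module, name)  # AttributeError if no such stat
            self._cache[name] = ToyNodeStat(self.net, self.ids, func)
        return self._cache[name]


class ToyNodeView:
    def __init__(self, net, ids):
        self.ids = ids
        self._dispatcher = ToyDispatcher(net, ids, toy_nodestats)

    def __getattr__(self, name):
        # Step 1: the view has no "degree" attribute, so lookup lands here.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._dispatcher, name)


edges = [{1, 2, 3}, {2, 3}]
nodes = ToyNodeView(edges, [1, 2, 3])
print(nodes.degree.asdict())
print(nodes.degree is nodes.degree)  # the cached wrapper is reused (step 5)
```

The underscore guard in each __getattr__ is a design choice of this sketch: it keeps lookups of private names from being forwarded to the stats namespace.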

(Note: writing this all down makes me realize that we can simplify this entire pipeline a great deal in at least two different ways, so I'll work on changing that soon!)

(Note: this has been simplified a bit by #209 )

  1. the docs of H.nodes and H.edges should mention the stats and point to the relevant documentation (maybe the stats package?).

As you point this out, I realize that the docs of H.nodes are... terrible:

H.nodes?
Type:        property
String form: <property object at 0x7f9c7aa7b4f0>
Docstring:   A :class:`NodeView` of this network.

This is because H.nodes is a property and the actual docs that should be displayed live in the private member:

In [26]: H._nodeview?
Signature:      H._nodeview(bunch)
Type:           NodeView
String form:    []
Length:         0
File:           ~/code/xgi/xgi/classes/reportviews.py
Docstring:     
An IDView that keeps track of node ids.

Parameters
----------
hypergraph : Hypergraph
    The hypergraph whose nodes this view will keep track of.
bunch : optional iterable, default None
    The node ids to keep track of.  If None (default), keep track of all node ids.

Notes
-----
In addition to the methods listed in this page, other methods defined in the `stats`
package are also accessible via the `NodeView` class.  For more details, see the
`tutorial
<https://github.com/ComplexGroupInteractions/xgi/blob/main/tutorials/Tutorial%206%20-%20Statistics.ipynb>`_.
Call docstring:
Filter to the given bunch.

Parameters
----------
bunch : Iterable
    Iterable of IDs

Returns
-------
IDView
    A new view that keeps track only of the IDs in the bunch.

As you can see, these latter docs actually DO mention and point to the tutorial! Perhaps we should add another link to the stats package, and make sure that H.nodes gives better docs.
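
One possible way to make H.nodes show better docs, sketched with toy classes: property() accepts a doc argument, so the property's docstring can be assembled from the view class's docstring. ToyNodeView and ToyHypergraph are hypothetical names, and this is only an assumption about how the fix could look, not what xgi does.

```python
import inspect


class ToyNodeView:
    """A view that keeps track of node ids.

    Notes
    -----
    Methods defined in the stats package are also accessible from this view.
    """

    def __init__(self, ids):
        self.ids = list(ids)


class ToyHypergraph:
    def __init__(self, ids):
        self._nodeview = ToyNodeView(ids)

    def _get_nodes(self):
        return self._nodeview

    # Reuse the view's class docstring instead of a one-line summary.
    nodes = property(
        _get_nodes,
        doc="A NodeView of this network.\n\n"
        + inspect.cleandoc(ToyNodeView.__doc__ or ""),
    )


print(ToyHypergraph.nodes.__doc__)
```

With this, the notes that live on the view class are also shown wherever the property's docstring is displayed.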


  2. the stats package documentation page should explicitly mention the equivalence.
  3. the docs of H.degree (and other stats) should at least point to the docs of H.nodes.degree (which it now does)
  4. but then the docs of H.nodes.degree should be explicit, mention the equivalence with the stats package syntax, and be findable also online.

Agreed on all three.


If there is a way to solve the last point without writing the docstrings twice, great. If not, I suggest we write it twice (at least for now).

How about something like this:

H.nodes.degree?
Signature:   H.nodes.degree(*args, **kwargs)
Type:        NodeStat
String form: NodeStat('degree')
File:        ~/code/xgi/xgi/stats/__init__.py
Docstring:  
A node mapping with name "degree".

Executing `H.nodes.degree(*args, **kwargs).asdict()` is equivalent to
`xgi.stats.nodestats.degree(H, *args, **kwargs)`. The available *args
and **kwargs are listed in the docstring of `xgi.stats.nodestats.degree`.
Said docstring is appended at the end of this docstring for reference.

For more details, see the `tutorial
<https://github.com/ComplexGroupInteractions/xgi/blob/main/tutorials/Tutorial%206%20-%20Statistics.ipynb>`_.

What follows is the docstring of `xgi.stats.nodestats.degree`.

Node degree.

The degree of a node is the number of edges it belongs to.

Parameters
----------
net : xgi.Hypergraph
    The network.
bunch : Iterable
    Nodes in `net`.
order : int | None
    If not None (default), only count the edges of the given order.
weight : str | None
    If not None, specifies the name of the edge attribute that determines the weight
    of each edge.

Returns
-------
dict

Or something along those lines? We can do this fairly easily. Anything more complicated will require some major redesigning.

leotrs avatar Nov 01 '22 07:11 leotrs

The long answer is, well, long. When a user executes H.nodes.degree for the first time a number of things happen:

Note: your PR #209 proposes to simplify that.

As you can see, these latter docs actually DO mention and point to the tutorial! Perhaps we should add another link to the stats package, and make sure that H.nodes gives better docs.

Oh okay, that's where it is. But yes, it should also be available from H.nodes, which is the only one exposed to the user. And adding a link to the stats package sounds good too.

How about something like this:

Nice, looks much better. Is this hard-coded copy-paste, or did you find another clever way? If it's just copy-pasted, why not adapt the docs as follows:

H.nodes.degree?
Signature:   H.nodes.degree(*args, **kwargs)
Type:        NodeStat
String form: NodeStat('degree')
File:        ~/code/xgi/xgi/stats/__init__.py
Docstring:  
A node mapping with name "degree".

The degree of a node is the number of edges it belongs to.

Executing `H.nodes.degree(*args, **kwargs).asdict()` is equivalent to
`xgi.stats.nodestats.degree(H, *args, **kwargs)`. The available *args
and **kwargs are listed in the docstring of `xgi.stats.nodestats.degree`.

For more details, see the `tutorial
<https://github.com/ComplexGroupInteractions/xgi/blob/main/tutorials/Tutorial%206%20-%20Statistics.ipynb>`_.

Parameters
----------
bunch : Iterable
    Nodes in `net`.
order : int | None
    If not None (default), only count the edges of the given order.
weight : str | None
    If not None, specifies the name of the edge attribute that determines the weight
    of each edge.

Returns
-------
dict 

maximelucas avatar Nov 02 '22 09:11 maximelucas

Nice, looks much better. Is this hard-coded copy-paste, or did you find another clever way? If it's just copy-pasted, why not adapt the docs as follows:

What I posted is possible by just doing something like func.__doc__ = some_str_here. So we can easily append the docstring of one function to that of some other function. Modifying each docstring is much much more difficult tho.
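
A minimal sketch of that idea, assuming a toy wrapper class (ToyNodeStat and the degree function below are hypothetical stand-ins, not the xgi code). Assigning __doc__ on the instance shadows the class docstring, and inspect.getdoc then returns the combined text.

```python
import inspect


def degree(net, bunch, order=None):
    """Node degree.

    The degree of a node is the number of edges it belongs to.
    """


class ToyNodeStat:
    """An arbitrary node-quantity mapping."""

    def __init__(self, func):
        self._func = func
        name = func.__name__
        # Build a per-stat docstring and append the wrapped function's own docs.
        self.__doc__ = (
            f'A node mapping with name "{name}".\n\n'
            f"What follows is the docstring of `{name}`.\n\n"
            f"{inspect.getdoc(func)}"
        )


stat = ToyNodeStat(degree)
print(inspect.getdoc(stat))
```

Appending works because the wrapped docstring is treated as an opaque string; removing or rewriting individual parameter entries would instead require parsing the docstring.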

leotrs avatar Nov 02 '22 10:11 leotrs

I did not know that. Great then, I like your solution.

maximelucas avatar Nov 02 '22 10:11 maximelucas