xgi icon indicating copy to clipboard operation
xgi copied to clipboard

Implement EmptyHypergraph

Open leotrs opened this issue 2 years ago • 3 comments

As per #152, certain functions do not handle empty hypergraphs well. In fact, most of the library has been written without empty hypergraphs in mind. There are a few ways to address this.

  1. Don't do anything and continue the current status quo. The result is that whenever the user wants to run anything on an empty hypergraph (i.e. remove nodes or edges, access attributes, compute statistics, etc) they will get different exceptions from different points in the codebase. This may become obnoxious and confusing.
  2. Add checks for empty hypergraphs everywhere. Basically, we can do input validation every time we need to compute anything that would otherwise crash if the underlying hypergraph was empty. This would help the user a lot and provide nicer error messages but would litter the codebase with silly checks all over the place. This will make the codebase hard to maintain, and methods' source code longer than it would otherwise.
  3. Create a class EmptyHypergraph that inherits from Hypergraph. This class would override the Hypergraph methods in two ways: a) the add_* methods call the parent class' method and then return a new Hypergraph object (not an EmptyHypergraph), and b) the remove_* methods immediately raise an exception (because nothing can be removed from an empty hypergraph). In this way, from the user perspective, we always have nice error messages, and from the developers' perspective, we do not need to litter the entire codebase with checks for empty hypergraphs. The only class that needs to care such a check is EmptyHypergraph. We should just make sure that instances of EmptyHypergraph do not have any access to any of the features that would otherwise fail if executed on an empty hypergraph.

~~This issue proposes we implement option 3.~~

EDIT: I no longer advocate for option 3, see below.

leotrs avatar Aug 31 '22 07:08 leotrs

EDIT: I no longer advocate for option 3, see below.

~~Just to be a bit more concrete, here is an example skeleton of EmptyHypergraph:~~

class EmptyHypergraph(Hypergraph):
    def add_node(self, node):
        H = Hypergraph()
        H.add_node(node)
        return H

    def remove_node(self, node):
        raise TypeError    # or some other exception

    def nodes(self):
        return None    # or an empty list, but NOT an empty NodeView

    def nodes(self):
        return None    # or an empty list, but NOT an empty NodeView

    def copy(self):
        return EmptyHypergraph()

    def dual(self):
        return EmptyHypergraph()

All other add_* and remove_* methods are given a similar treatment. This covers all methods of the Hypergraph class. The main trick here is, I think, the fact that an EmptyHypergraph has NO NodeView or EdgeView, not even an empty one. Because of this, the user never even gets access to things like computing centrality measures (as in H.nodes.degree or as in #152 ). And therefore our centrality measures and the entire stats package need not ever care about empty hypergraphs.

EDIT: there is one more necessary change that would need to be made to remove_* methods in Hypergraph, like so:

class Hypergraph:
    def remove_node(self, node):
        # .... usual code here
        if not self._node and not self._edge:
            return EmptyHypergraph()

@nwlandry @maximelucas what do yall think?

leotrs avatar Sep 01 '22 07:09 leotrs

Thanks for this Leo. Options 1 and 2 don't feel optimal.

  • right now, H.nodes.degree and all other similar stats functions would give some uninformative error like None has no method degree() or something, right? How could we raise a more informative error like Cannot compute because this is an empty hypergraph? Another option would be for H.nodes.degree and similar to return 0s everywhere.

  • If people do H = xgi.Hypergraph() and don't add any edges, it will not be an EmptyHypergraph. But I don't expect people to do

H = xgi.EmptyHypergraph()
H.add_edge([1,2]

If they don't, they will only get an EmptyHypergraph by removing all edges from a Hypergraph. Not sure what to do.

maximelucas avatar Oct 28 '22 08:10 maximelucas

Thank you for your answer.

  • right now, H.nodes.degree and all other similar stats functions would give some uninformative error like None has no method degree() or something, right? How could we raise a more informative error like Cannot compute because this is an empty hypergraph? Another option would be for H.nodes.degree and similar to return 0s everywhere.

One problem that I just thought of is that, on empty hypergraphs, some stats will work fine (e.g. node attributes), some will work but return funny answers (e.g. every node having degree zero), and some will just not work (unless each stat makes a check for the empty hypergraph...). This seems way too messy for me and I don't think we want to have three different behaviors based on whether the hypergraph is empty or not.

  • If people do H = xgi.Hypergraph() and don't add any edges, it will not be an EmptyHypergraph. But I don't expect people to do
H = xgi.EmptyHypergraph()
H.add_edge([1,2]

If they don't, they will only get an EmptyHypergraph by removing all edges from a Hypergraph. Not sure what to do.

Yes you are right. On second thought, I no longer like the solution I provided above. I'm glad I didn't implement it :sweat_smile: .

leotrs avatar Oct 28 '22 11:10 leotrs

@leotrs do we still need this or should we close it?

maximelucas avatar May 12 '23 10:05 maximelucas

Yeah, this was a bad idea.

leotrs avatar May 15 '23 12:05 leotrs