rdflib icon indicating copy to clipboard operation
rdflib copied to clipboard

Fix Liskov substitution principle violations in sub-classes of Graph

Open aucampia opened this issue 2 years ago • 6 comments

We have many liskov substitution principle violations in the Graph class hierarchy, ideally this should be fixed in the next major version.

aucampia avatar Jul 15 '22 12:07 aucampia

I'd be grateful if you could reference an example, just to ground my model.

ghost avatar Jul 15 '22 13:07 ghost

One example:

  • Graph.add: https://github.com/RDFLib/rdflib/blob/65ccae5f3302d868a90fc7177d078b76ce50db40/rdflib/graph.py#L451
  • ConjunctiveGraph.add https://github.com/RDFLib/rdflib/blob/65ccae5f3302d868a90fc7177d078b76ce50db40/rdflib/graph.py#L1762-L1768

If someone calls graph.add(triple=...) then it will work if graph is Graph, but not if it is a ConjunctiveGraph as the argument name for add is different in ConjunctiveGraph, and there someone will have to call graph.add(triple_or_quad=...). There are more cases like this, mypy does warn about them if the methods have typing.

Two other examples:

https://github.com/RDFLib/rdflib/blob/65ccae5f3302d868a90fc7177d078b76ce50db40/rdflib/graph.py#L2197-L2212

https://github.com/RDFLib/rdflib/blob/65ccae5f3302d868a90fc7177d078b76ce50db40/rdflib/graph.py#L2214-L2219

aucampia avatar Jul 15 '22 17:07 aucampia

Okay, thanks for the specifics. Am I correct in my understanding that you want Dataset to not inherit from Graph?

That would be one option and possibly the best, possibly we can have some shared base class for Graph and Dataset though, and there are various workarounds we can take, but in general all inheritance should adhere to the Liskov substitution principle.

This is not the most serious issue, but just something we should keep in mind.

aucampia avatar Jul 15 '22 18:07 aucampia

in general all inheritance should adhere to the Liskov substitution principle

If it's appropriate for the context, which is probably the case here and if it doesn't negatively impact the API either in terms of performance or usability¹

... possibly we can have some shared base class for Graph and Dataset though

That would be Store, I guess :smile:

¹ Object Oriented Programming is an expensive disaster which must end

ghost avatar Jul 15 '22 19:07 ghost

in general all inheritance should adhere to the Liskov substitution principle

If it's appropriate for the context, which is probably the case here and if it doesn't negatively impact the API either in terms of performance or usability¹

It is fairly ingrained in python though, isinstance checks "if the object argument is an instance of the classinfo argument, or of a (direct, indirect, or virtual) subclass thereof". If classes do not adhere to LSP then isinstance is not useful and type checking has to be done with identity, which is also against PEP-8's recommendation which states that "Object type comparisons should always use isinstance() instead of comparing types directly" [ref].

aucampia avatar Jul 15 '22 21:07 aucampia

A-ha, you started me thinking (sometimes it happens :smile:) and this profoundly impacts the Dataset re-work in that some of problems I've been wrestling with in the implementation actually originate from the LSP violations you have identified.

As I've observed in the Discussion, Dataset as a sub-class of Graph makes no sense in the domain model, let alone being the source of so many LSP violations and elsewhere, LSP violations are clearly signalled - such as the new Exceptions raised for ReadOnlyGraphAggregate, a subclass of Graph.

ATM, I'm looking at a MixinClass solution, that seems to reflect the design semantics better and, as an approach, seems to be quite amenable to static typing

ghost avatar Jul 17 '22 10:07 ghost