`transitive_subjects` and `transitive_objects` return the starting node as first element
Both `transitive_subjects` and `transitive_objects` return the starting node as the first element. This is unintuitive, and it does not match what the docstring describes.
A bit more detail from the example by @jjon in #1303:
>>> pprint(list(cg.transitive_subjects(RDF.type, pome.Person)))
[rdflib.term.URIRef('http://prosopOnto.medieval.england/2006/04/pome#Person'),
rdflib.term.URIRef('http://example.com/thisgraph#Hugh_Despenser'),
rdflib.term.URIRef('http://example.com/thisgraph#Audley_Henry_de'),
rdflib.term.URIRef('http://example.com/thisgraph#Thomas_earl_of_Warwick_d_1242'),
.
.
. etc.
]
The `transitive_subjects` method yields `pome:Person` even though it is not the subject of any triple with `rdf:type` as predicate and `pome:Person` as object.
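To make the behaviour easy to reproduce without the original data, here is a minimal sketch in plain Python. It is a simplified stand-in, not rdflib's code: the graph is just a list of string triples, and the function, the `remember` set and the `ex:` names are illustrative assumptions loosely modeled on the traversal linked at the end of this issue.

```python
from typing import Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), hypothetical model


def transitive_subjects(
    triples: List[Triple],
    predicate: str,
    obj: str,
    remember: Optional[Set[str]] = None,
) -> Iterator[str]:
    """Depth-first walk that guards against cycles with a 'remember' set."""
    if remember is None:
        remember = set()
    if obj in remember:
        return
    remember.add(obj)
    # The current node is yielded before any triple is consulted,
    # so the very first item is always the starting node itself.
    yield obj
    for s, p, o in triples:
        if p == predicate and o == obj:
            yield from transitive_subjects(triples, predicate, s, remember)


triples = [
    ("ex:Hugh", "rdf:type", "pome:Person"),
    ("ex:Henry", "rdf:type", "pome:Person"),
]
result = list(transitive_subjects(triples, "rdf:type", "pome:Person"))
print(result)  # ['pome:Person', 'ex:Hugh', 'ex:Henry']
```

In this model the cycle guard and the `yield` are tied together: a node is yielded at the moment it is marked as visited, and the starting node is the first one to be marked.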
In issue #1303 @white-gecko suggests that you can "just skip the first element when working with the list", but that means every caller of these methods has to remember to skip the first element.
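For reference, this is roughly what that workaround looks like on the caller's side. The helper name `skip_start` is hypothetical and not part of rdflib; it takes any iterable of nodes, so it is shown here with plain strings.

```python
from typing import Hashable, Iterable, Iterator

_MISSING = object()  # sentinel so that an empty iterable is handled cleanly


def skip_start(nodes: Iterable[Hashable], start: Hashable) -> Iterator[Hashable]:
    """Yield nodes, dropping the first one if it is the starting node."""
    it = iter(nodes)
    first = next(it, _MISSING)
    if first is _MISSING:
        return
    if first != start:
        # Only reached if the traversal did not lead with the start node.
        yield first
    yield from it


found = list(skip_start(["pome:Person", "ex:Hugh", "ex:Henry"], "pome:Person"))
print(found)  # ['ex:Hugh', 'ex:Henry']
```

With rdflib this would presumably be called as `skip_start(cg.transitive_subjects(RDF.type, pome.Person), pome.Person)`. Note that it also discards the starting node in a cyclic graph where the node really is reachable from itself, which is one more reason a fix in the library would be preferable.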
Suggested fixes:
- Update the code to make it behave as expected (this may be non-trivial given that the existing behaviour is due to the anti-recursive check)
- If that fails, update the docstring to reflect actual behaviour.
https://github.com/RDFLib/rdflib/blob/e09ce43f2844d0b0f96ec5b976015901f9268873/rdflib/graph.py#L1141-L1181
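One possible shape for the first suggested fix, sketched on a plain list of triples rather than on rdflib's `Graph`. Everything here (the function name, the `seen` set, the `sub` predicate) is an assumption for illustration. The idea is to mark a node as visited when it is reached through a triple, not when the walk starts from it, so the starting node is only yielded if a cycle actually leads back to it.

```python
from typing import Iterator, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), hypothetical model


def transitive_subjects_excl(
    triples: List[Triple], predicate: str, obj: str
) -> Iterator[str]:
    """Depth-first transitive subjects, without yielding the start up front."""
    seen: Set[str] = set()

    def walk(node: str) -> Iterator[str]:
        for s, p, o in triples:
            # A subject is yielded once, the first time a triple reaches it.
            if p == predicate and o == node and s not in seen:
                seen.add(s)
                yield s
                yield from walk(s)

    yield from walk(obj)


acyclic = [("B", "sub", "A"), ("C", "sub", "B")]
cyclic = acyclic + [("A", "sub", "C")]

print(list(transitive_subjects_excl(acyclic, "sub", "A")))  # ['B', 'C']
print(list(transitive_subjects_excl(cyclic, "sub", "A")))   # ['B', 'C', 'A']
```

The `seen` set still stops the recursion on cyclic data, so the anti-recursion guarantee is kept. Whether yielding the start node in the cyclic case is the desired behaviour would need to be decided as part of the fix.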