jena
jena copied to clipboard
rdfdiff
Version
4.9.0
What happened?
rdfdiff is not functioning as expected. will issue a PR with suggested fixes & tests.
Relevant output and stacktrace
No response
Are you interested in making a pull request?
Yes
see PR #1989
Bug reports should contain actual steps to reproduce the issue - "not functioning as expected" - is not a useful bug report. Please provide steps, including sample data, that show scenario(s) where two input graphs don't yield the expected diff
Latest changed to 4.9.0 ... so it is always valid in the future :-)
Bug reports should contain actual steps to reproduce the issue - "not functioning as expected" - is not a useful bug report. Please provide steps, including sample data, that show scenario(s) where two input graphs don't yield the expected diff
You are right - apologies. I should've waited with this issue until I had enough time to elaborate.
The issues I found: (lines refer to original code)
-
Any statements with blank nodes are seen as concrete (lines 120, 135). (Node_Blank is a subclass of Node_Concrete, and thus returns true for #isConcrete, which I think is the issue here.)
-
A subgraph from the second set should be added to the candidates list, not the subgraph from first set that is being compared with (line 180).
-
For a statement's first non-encountered bnode, it creates a new subgraph and adds the statement and its forward closure to it (lines 228-235).
However, this does not support cases where multiple nodes end at the same bnode. If you run on testing/cmd/rdfdiff_1.ttl and testing/cmd/rdfdiff_2.ttl (after fixing the above), you see that the "England" triple is ignored as it was not part of the closure of the bnode shared with the "America" triple.
There's a few solutions for this - my original PR implemented one potential solution, but now I'm unsure whether it's the way to go - perhaps adding a different kind of closure would be better.
Blank nodes are concrete. "concrete" covers the RDF terms that can go in an RDF graph and be printed.
The test "triple.concrete" isn't appropriate (from reading the code).