java-object-diff
java-object-diff copied to clipboard
Make it possible to diff and merge ordered collections like ArrayLists
This issue should serve as aggregator for all the requests regarding this topic.
Currently all Collections
are treated like Sets
:
- It is assumed there can be only one instance with the same identity (via
hashCode
&equals
) - Order is completely disregarded (which can't even considered as Set-like behavior, since for example
LinkedHashSet
has a specific order) - I'm not sure about this one, but aren't there collections that use
compareTo
instead ofequals
andhashCode
?
This all is difficult enough to solve for the diff part, but don't forget the merge part, where the detected changed need to be written to a new instance which may or may not be one of the instances used to create the diff.
In issue List diff is not intuitive to use #100 we have talked about it. I have created a kind of proof of concept implementation of various collection comparisons. My implementation is not as flexible as object-diff (specialized subset of features for my specialized needs) but the concept what I have implemented seems to be working (I got lots of inspiration from object-diff :) :
e.g:
So referring to concepts in #100:
Ordered diff:
@Test
public void list_different() throws Exception {
List<Integer> working = IList(1);
List<Integer> base = IList(2);
DiffNode diff = differ.compare(working, base);
assertChanged(diff);
diff.assertPathChanged("/[0]");
}
@Test
public void list_different_removal() throws Exception {
List<Integer> working = IList(1);
List<Integer> base = IList(1, 2);
DiffNode diff = differ.compare(working, base);
diff.assertPathRemoved("/[1]");
}
@Test
public void list_different_additon() throws Exception {
DiffNode diff = differ.compare(IList(1, 2), IList(1));
diff.assertPathAdded("/[1]");
}
UnorderedDiff:
@Test
public void eliminated_differences() throws Exception {
DiffNode diff = differ.compare(IList(2, 2, 1, 1, 2, 1), IList(1, 2));
diff.assertUnchanged();
}
@Test
public void same_size_different() throws Exception {
DiffNode diff = differ.compare(IList(1, 2, 3), IList(1, 2, 4));
diff.assertChanged();
diff.assertPathAdded("/<ADDED>", 3);
diff.assertPathRemoved("/<REMOVED>", 4);
}
CustomKeyDiff:
@Test
public void identify_matching_pairs_independenty_from_order() throws Exception {
DiffNode diff = differ.compare(
IList(new A(1, "a"), new A(3, "a"), new A(2, "a"))
, IList(new A(3, "a"), new A(1, "b"), new A(2, "a"))
);
diff.assertChanged();
diff.assertPathChanged("/{i=1}.s");
}
@Test
public void detect_addition_and_emoval_based_on_key() throws Exception {
DiffNode diff = differ.compare(
IList(new A(1, "a"), new A(4, "a"), new A(2, "a"))
, IList(new A(1, "a"), new A(2, "a"), new A(5, "a"))
);
diff.assertChanged();
diff.assertPathAdded("/<ADDED>", new A(4, "a"));
diff.assertPathRemoved("/<REMOVED>", new A(5, "a"));
}
}
It also possible to configure by path within the object graph:
Map<String, CollectionComparisonStrategy> strategiesByPath = new HashMap<String, CollectionComparisonStrategy>();
strategiesByPath.put("/termRefs", new IndexedComparisonStrategy());
strategiesByPath.put("/babSet", new UnorderedCollectionComparisonStrategy());
strategiesByPath.put("/assSet", new CustomKeyCollectionComparisonStrategy(new Key("i")));
ComparisonStrategyResolver resolver = new ComparisonStrategyResolver(strategiesByPath);
differ.setComparisonStrategyResolver(resolver);
And the path is matched against collection attributes too:
strategiesByPath.put("/ccc.termRefs", new UnorderedCollectionComparisonStrategy());
where ccc
is collection and ternRefs
a collection too. And path is matched against any of termRefs
Implementation is available: https://github.com/takacsot/toolbox/tree/master/src/main/java/eu/qualityontime/diff
Test case: https://github.com/takacsot/toolbox/blob/master/src/test/java/eu/qualityontime/diff/ObjectDifferTest.java
Good job! Especially the ordered strategy raises some interesting questions. Your current implementation reports positional changes as REMOVE
or ADD
. However, I think for a more generic approach it would be useful to indicate a MOVE
, in order to run a fine gained comparison against the moved objects.
Here is the scenario I have in mind: image someone who uses the diff to trigger some business logic like removing an object from the database on REMOVE
or triggering a cascade of events on ADD
. Unless the object has really been removed or added respectively, these actions could be pretty harmful. A special MOVE
state would make it clear that the item still exists, but its position in the collection has changed.
Unfortunately a MOVE
isn't exclusive, since the object could also be CHANGED
. The Node API would need some work in order to transport this information. Either via multi-states or probably with some kind of meta state. Currently I can't imagine either one of them, although I lean towards multi-state.
Well, My opinion is that a simple index-by-index comparison and a move detection is not the same. I understand your point. I think you should consider move-detection-strategy as a simple use selected alternative.
e.g. I my specialized need index based strategy is a good choice because the order of item is so important that a MOVE detection would not give value.
On the other hand MOVE detection strategy could not give answer to the following.
Given original List(1,1,2,2)
Given working List(1,1,3,2,2)
What kind of movement to be detected?
- 2 is moved from index 2 to index 4
- 2 is moved from index 2 to index 3 AND 2 is moved from index 3 to index 4
Both answer is correct. It highly depending on the need of end user (developer).
Summary: the more strategy we have the better the product is because we could choose what we need.
(As requested in the docs, I gladly add myself to the list of people that would love to have this implemented. Trying to diff deeply nested objects like http://www.hl7.org/implement/standards/fhir/patient.html seems not doable right now.)
I would also be very interesting in this type of functionality.
My primary use for this library would be for regression testing when migrating/re-implementing services. Currently the object to compare are multi-level collections hierarchy represented by XML. So I thought a recursive approach of using ComparisonStrategy would be useful.
private static class ListComparisonStrategy implements ComparisonStrategy {
public void compare(DiffNode node, Class<?> type, Object working, Object base) {
if (working == null && base == null) {
node.setState(State.UNTOUCHED);
} else if (working == null && base != null) {
node.setState(State.REMOVED);
} else if (working != null && base == null) {
node.setState(State.ADDED);
} else if (((Collection) working).size() != ((Collection) base).size()) {
node.setState(State.CHANGED);
} else {
Iterator iterWorking = ((List) working).iterator();
Iterator iterbase = ((List) base).iterator();
while (iterbase.hasNext()) {
Object w = iterWorking.next();
Object b = iterbase.next();
ObjectDifferBuilder.startBuilding()
.comparison()
.ofType(ArrayList.class)
.toUse(new CollectionComparisonStrategy())
.and()
.build()
.compare(w, b)
.visit(new PrintingVisitor(w, b));
}
}
}
}
This allows me to recursively use ComparisonStrategy for Lists (ordered Collections).