java-object-diff icon indicating copy to clipboard operation
java-object-diff copied to clipboard

Make it possible to diff and merge ordered collections like ArrayLists

Open SQiShER opened this issue 10 years ago • 7 comments

This issue should serve as aggregator for all the requests regarding this topic.

SQiShER avatar Sep 29 '14 22:09 SQiShER

Currently all Collections are treated like Sets:

  • It is assumed there can be only one instance with the same identity (via hashCode & equals)
  • Order is completely disregarded (which can't even considered as Set-like behavior, since for example LinkedHashSet has a specific order)
  • I'm not sure about this one, but aren't there collections that use compareTo instead of equals and hashCode?

This all is difficult enough to solve for the diff part, but don't forget the merge part, where the detected changed need to be written to a new instance which may or may not be one of the instances used to create the diff.

SQiShER avatar Sep 29 '14 22:09 SQiShER

In issue List diff is not intuitive to use #100 we have talked about it. I have created a kind of proof of concept implementation of various collection comparisons. My implementation is not as flexible as object-diff (specialized subset of features for my specialized needs) but the concept what I have implemented seems to be working (I got lots of inspiration from object-diff :) :

e.g:

So referring to concepts in #100:

Ordered diff:

  @Test
    public void list_different() throws Exception {
      List<Integer> working = IList(1);
      List<Integer> base = IList(2);
      DiffNode diff = differ.compare(working, base);
      assertChanged(diff);
      diff.assertPathChanged("/[0]");

    }

    @Test
    public void list_different_removal() throws Exception {
      List<Integer> working = IList(1);
      List<Integer> base = IList(1, 2);
      DiffNode diff = differ.compare(working, base);
      diff.assertPathRemoved("/[1]");
    }

    @Test
    public void list_different_additon() throws Exception {
      DiffNode diff = differ.compare(IList(1, 2), IList(1));
      diff.assertPathAdded("/[1]");
    }

UnorderedDiff:

 @Test
    public void eliminated_differences() throws Exception {
      DiffNode diff = differ.compare(IList(2, 2, 1, 1, 2, 1), IList(1, 2));
      diff.assertUnchanged();
    }

    @Test
    public void same_size_different() throws Exception {
      DiffNode diff = differ.compare(IList(1, 2, 3), IList(1, 2, 4));
      diff.assertChanged();
      diff.assertPathAdded("/<ADDED>", 3);
      diff.assertPathRemoved("/<REMOVED>", 4);
    }

CustomKeyDiff:

 @Test
    public void identify_matching_pairs_independenty_from_order() throws Exception {
      DiffNode diff = differ.compare(
          IList(new A(1, "a"), new A(3, "a"), new A(2, "a"))
          , IList(new A(3, "a"), new A(1, "b"), new A(2, "a"))
          );
      diff.assertChanged();
      diff.assertPathChanged("/{i=1}.s");
    }

    @Test
    public void detect_addition_and_emoval_based_on_key() throws Exception {
      DiffNode diff = differ.compare(
          IList(new A(1, "a"), new A(4, "a"), new A(2, "a"))
          , IList(new A(1, "a"), new A(2, "a"), new A(5, "a"))
          );
      diff.assertChanged();
      diff.assertPathAdded("/<ADDED>", new A(4, "a"));
      diff.assertPathRemoved("/<REMOVED>", new A(5, "a"));
    }
  }

It also possible to configure by path within the object graph:

    Map<String, CollectionComparisonStrategy> strategiesByPath = new HashMap<String, CollectionComparisonStrategy>();
      strategiesByPath.put("/termRefs", new IndexedComparisonStrategy());
      strategiesByPath.put("/babSet", new UnorderedCollectionComparisonStrategy());
      strategiesByPath.put("/assSet", new CustomKeyCollectionComparisonStrategy(new Key("i")));
      ComparisonStrategyResolver resolver = new ComparisonStrategyResolver(strategiesByPath);
      differ.setComparisonStrategyResolver(resolver);

And the path is matched against collection attributes too:

    strategiesByPath.put("/ccc.termRefs", new UnorderedCollectionComparisonStrategy());

where ccc is collection and ternRefs a collection too. And path is matched against any of termRefs Implementation is available: https://github.com/takacsot/toolbox/tree/master/src/main/java/eu/qualityontime/diff

Test case: https://github.com/takacsot/toolbox/blob/master/src/test/java/eu/qualityontime/diff/ObjectDifferTest.java

takacsot avatar Sep 30 '14 17:09 takacsot

Good job! Especially the ordered strategy raises some interesting questions. Your current implementation reports positional changes as REMOVE or ADD. However, I think for a more generic approach it would be useful to indicate a MOVE, in order to run a fine gained comparison against the moved objects.

Here is the scenario I have in mind: image someone who uses the diff to trigger some business logic like removing an object from the database on REMOVE or triggering a cascade of events on ADD. Unless the object has really been removed or added respectively, these actions could be pretty harmful. A special MOVE state would make it clear that the item still exists, but its position in the collection has changed.

Unfortunately a MOVE isn't exclusive, since the object could also be CHANGED. The Node API would need some work in order to transport this information. Either via multi-states or probably with some kind of meta state. Currently I can't imagine either one of them, although I lean towards multi-state.

SQiShER avatar Sep 30 '14 21:09 SQiShER

Well, My opinion is that a simple index-by-index comparison and a move detection is not the same. I understand your point. I think you should consider move-detection-strategy as a simple use selected alternative.

e.g. I my specialized need index based strategy is a good choice because the order of item is so important that a MOVE detection would not give value.

On the other hand MOVE detection strategy could not give answer to the following.

Given original List(1,1,2,2)

Given working List(1,1,3,2,2)

What kind of movement to be detected?

  • 2 is moved from index 2 to index 4
  • 2 is moved from index 2 to index 3 AND 2 is moved from index 3 to index 4

Both answer is correct. It highly depending on the need of end user (developer).

Summary: the more strategy we have the better the product is because we could choose what we need.

takacsot avatar Oct 01 '14 13:10 takacsot

(As requested in the docs, I gladly add myself to the list of people that would love to have this implemented. Trying to diff deeply nested objects like http://www.hl7.org/implement/standards/fhir/patient.html seems not doable right now.)

bachi76 avatar Dec 05 '14 20:12 bachi76

I would also be very interesting in this type of functionality.

ctmay4 avatar Feb 24 '15 16:02 ctmay4

My primary use for this library would be for regression testing when migrating/re-implementing services. Currently the object to compare are multi-level collections hierarchy represented by XML. So I thought a recursive approach of using ComparisonStrategy would be useful.

    private static class ListComparisonStrategy implements ComparisonStrategy {

        public void compare(DiffNode node, Class<?> type, Object working, Object base) {
            if (working == null && base == null) {
                node.setState(State.UNTOUCHED);
            } else if (working == null && base != null) {
                node.setState(State.REMOVED);
            } else if (working != null && base == null) {
                node.setState(State.ADDED);
            } else if (((Collection) working).size() != ((Collection) base).size()) {
                node.setState(State.CHANGED);
            } else {
                Iterator iterWorking = ((List) working).iterator();             
                Iterator iterbase = ((List) base).iterator();
                while (iterbase.hasNext()) {
                    Object w = iterWorking.next();
                    Object b = iterbase.next();
                    ObjectDifferBuilder.startBuilding()
                        .comparison()
                        .ofType(ArrayList.class)
                        .toUse(new CollectionComparisonStrategy())
                        .and()
                        .build()
                        .compare(w, b)
                        .visit(new PrintingVisitor(w, b));
                }
            }
        }
    }

This allows me to recursively use ComparisonStrategy for Lists (ordered Collections).

potbunker avatar Sep 02 '15 13:09 potbunker