Semantics of word by word diff

Open riyadparvez opened this issue 7 years ago • 0 comments

I'm trying to understand the semantics of word to word diff in diffplex w.r.t. wdiff.

Output from wdiff:

➜ git:(diffplex) ✗ echo "a b" > /tmp/a.txt
➜ git:(diffplex) ✗ echo "a b c" > /tmp/b.txt
➜ git:(diffplex) ✗ wdiff /tmp/a.txt /tmp/b.txt
a b {+c+}

{+c+} means only "c" has been inserted.

If I try the same in diffplex word to word diff:

    var prev = "a b";
    var current = "a     b c";
    var differ = new Differ();
    var result = differ.CreateWordDiffs(prev, current, false, new char[] { ' ', '\n' });

    foreach (var block in result.DiffBlocks)
    {
        Console.WriteLine();
        Console.WriteLine("Insert Start: {0}\nInsert Count: {1}\nDelete Start: {2}\nDelete Count: {3}",
                          block.InsertStartB, block.InsertCountB, block.DeleteStartA, block.DeleteCountA);
    }

    Insert Start: 2
    Insert Count: 8
    Delete Start: 2
    Delete Count: 0

    Insert Start: 11
    Insert Count: 2
    Delete Start: 3
    Delete Count: 0

Converting the output above into wdiff format gives: a {+ +} b {+c+}. I'm not sure what the intended semantics of the word-by-word diff in diffplex are, but I think wdiff's output is more intuitive and is what is wanted in most cases, whereas diffplex's output resembles a character-by-character diff.

The output of diffplex is the same even if I set ignoreWhitespace to true, as in var result = differ.CreateWordDiffs(prev, current, true, new char[] { ' ', '\n' });. Is this by design?
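For reference, the conversion from diff blocks to wdiff notation that I'm describing can be sketched roughly as follows. This is only an illustration, not part of diffplex: `WdiffFormatter` and `ToWdiff` are hypothetical names, and the sketch assumes that the blocks are ordered by position and that the start/count values index into the lists of pieces (I believe these are exposed as `result.PiecesOld` and `result.PiecesNew`, but that is an assumption on my part).

```csharp
using System.Collections.Generic;
using System.Text;
using DiffPlex.Model;

// Hypothetical helper, not part of DiffPlex.
static class WdiffFormatter
{
    // Renders a diff in wdiff notation: [-deleted-] and {+inserted+}.
    // Assumes blocks are ordered by DeleteStartA and index into the piece lists.
    public static string ToWdiff(
        IReadOnlyList<string> oldPieces,
        IReadOnlyList<string> newPieces,
        IEnumerable<DiffBlock> blocks)
    {
        var sb = new StringBuilder();
        int posA = 0; // next old piece that has not been emitted yet

        foreach (var block in blocks)
        {
            // Copy the unchanged pieces that precede this block.
            for (; posA < block.DeleteStartA; posA++)
                sb.Append(oldPieces[posA]);

            if (block.DeleteCountA > 0)
            {
                sb.Append("[-");
                for (int i = 0; i < block.DeleteCountA; i++)
                    sb.Append(oldPieces[block.DeleteStartA + i]);
                sb.Append("-]");
                posA = block.DeleteStartA + block.DeleteCountA;
            }

            if (block.InsertCountB > 0)
            {
                sb.Append("{+");
                for (int i = 0; i < block.InsertCountB; i++)
                    sb.Append(newPieces[block.InsertStartB + i]);
                sb.Append("+}");
            }
        }

        // Copy whatever unchanged pieces remain after the last block.
        for (; posA < oldPieces.Count; posA++)
            sb.Append(oldPieces[posA]);

        return sb.ToString();
    }
}
```

Under those assumptions it would be called as `WdiffFormatter.ToWdiff(result.PiecesOld, result.PiecesNew, result.DiffBlocks)`. Because every piece is copied verbatim, any whitespace that diffplex reports as inserted shows up inside a {+ +} marker, which is the behaviour I'm asking about.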

riyadparvez, Mar 20 '17 16:03