htmldiff.net
Using AddBlockExpression for table rows
Hello Rohland, thank you for this project.
I understand that this project is not ideal for complex HTML, but I'm trying to improve it... :)
So, I found a way to add a block expression that finds differences in table rows:
AddBlockExpression(new Regex(@"<tr(.|\n)*?>(.|\n)*?<\/tr>", RegexOptions.IgnoreCase | RegexOptions.Multiline));
But this regex may select several tr rows as a single block for the diff.
Is there a way to make it handle each tr row separately?
Hi @Emins
Sorry, I don't have much time to look into this, but I think it comes down to the Regex being used.
You could try this:
<tr[^>]*>.*?</tr>
Let me know if this captures each row separately.
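One thing worth checking with the pattern above: in .NET, `.` does not match `\n` unless `RegexOptions.Singleline` is set (`RegexOptions.Multiline` only changes how `^` and `$` behave). The following is a minimal sketch, not code from htmldiff.net; the class and method names are made up for illustration. It counts how many rows the pattern captures with and without that option.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RowRegexDemo
{
    // Hypothetical helper: counts how many separate matches a pattern finds in the HTML.
    public static int CountMatches(string pattern, RegexOptions options, string html)
    {
        return Regex.Matches(html, pattern, options).Count;
    }

    public static void Main()
    {
        // Two rows spread over several lines, as in typical formatted HTML.
        string html = "<tr>\n<td>aaa bbb</td>\n</tr>\n<tr>\n<td>aaa ccc</td>\n</tr>";
        string pattern = @"<tr[^>]*>.*?</tr>";

        // Without Singleline, '.' stops at '\n', so a row that spans lines cannot match.
        Console.WriteLine(CountMatches(pattern, RegexOptions.IgnoreCase, html)); // prints 0

        // With Singleline, '.' also matches '\n'; the lazy '*?' ends each match
        // at the nearest </tr>, so every row becomes its own match.
        Console.WriteLine(CountMatches(pattern,
            RegexOptions.IgnoreCase | RegexOptions.Singleline, html)); // prints 2
    }
}
```

If the rows in the real input span several lines, passing `RegexOptions.Singleline` to the `Regex` given to AddBlockExpression may be enough to make this pattern capture each row on its own.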
Thank you for reply, @Rohland
Your regex doesn't match the rows. But I have no problem with the tr regex. This regex correctly detects each row as a separate "Block":
AddBlockExpression(new Regex(@"<tr(.|\n)*?>(.|\n)*?<\/tr>", RegexOptions.IgnoreCase | RegexOptions.Multiline));
and correctly detects changes, for example in this HTML:
<tr>
<td>
<p style="text-align: left;">aaa bbb</x:p>
</td>
</tr>
<tr>
<td>
<p style="text-align: left;">aaa ccc</x:p>
</td>
</tr>
The issue is with the marking when 2 rows have changed. Currently the 2 old rows are marked as del, followed by the 2 new rows marked as ins. I want 1 del, 1 ins, 1 del, 1 ins.
I think the issue is in the core concept, because the code searches for matching blocks, and 2 consecutive BlockExpression matches are detected as a single list of changed blocks.
I just found one way. It is not a good solution, but it may be helpful for someone. When inserting the replace tags, they can be added in alternating del and ins order. In Diff.cs, find the function:
private void ProcessReplaceOperation(Operation operation)
{
    ProcessDeleteOperation(operation, "diffmod");
    ProcessInsertOperation(operation, "diffmod");
}
replace with:
private void ProcessReplaceOperation(Operation operation)
{
    // Test code: emit deletes and inserts in pairs when the replaced range starts with a <tr> block.
    List<string> text1 = _oldWords.Where((s, pos) => pos >= operation.StartInOld && pos < operation.EndInOld).ToList();
    List<string> text2 = _newWords.Where((s, pos) => pos >= operation.StartInNew && pos < operation.EndInNew).ToList();

    // True only when the first old token exists and contains "<tr".
    bool startsWithRow = text1.FirstOrDefault()?.Contains("<tr") == true; // todo, improve
    if (!startsWithRow)
    {
        // Default behavior: all deletes first, then all inserts.
        ProcessDeleteOperation(operation, "diffmod");
        ProcessInsertOperation(operation, "diffmod");
    }
    else
    {
        // Walk both token lists in parallel so each del is followed by its ins.
        var maxCount = text1.Count > text2.Count ? text1.Count : text2.Count;
        for (int i = 0; i < maxCount; i++)
        {
            if (i < text1.Count)
                InsertTag(DeleteTagValue, "diffmod", new List<string> { text1[i] });
            if (i < text2.Count)
                InsertTag(InsertTagValue, "diffmod", new List<string> { text2[i] });
        }
    }
}
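To show the pairing idea on its own, outside of Diff.cs, here is a small standalone sketch. It is only an illustration under assumptions: `PairwiseDemo` and `Interleave` are hypothetical names, and plain strings stand in for the row blocks that htmldiff.net would pass to InsertTag.

```csharp
using System;
using System.Collections.Generic;

public static class PairwiseDemo
{
    // Hypothetical helper: pairs old and new blocks by index, emitting the old
    // block (as del) and then the new block (as ins) for every position.
    public static List<string> Interleave(IReadOnlyList<string> oldBlocks, IReadOnlyList<string> newBlocks)
    {
        var output = new List<string>();
        int maxCount = Math.Max(oldBlocks.Count, newBlocks.Count);
        for (int i = 0; i < maxCount; i++)
        {
            // A list that is shorter than the other simply contributes nothing
            // once its items run out.
            if (i < oldBlocks.Count) output.Add("del:" + oldBlocks[i]);
            if (i < newBlocks.Count) output.Add("ins:" + newBlocks[i]);
        }
        return output;
    }

    public static void Main()
    {
        var oldRows = new List<string> { "row1-old", "row2-old" };
        var newRows = new List<string> { "row1-new", "row2-new" };

        // prints "del:row1-old, ins:row1-new, del:row2-old, ins:row2-new"
        Console.WriteLine(string.Join(", ", Interleave(oldRows, newRows)));
    }
}
```

Note that pairing by index assumes the i-th old block corresponds to the i-th new block. If a row was inserted or removed in the middle of the changed range, the pairs after that point will not line up.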