php-htmldiff
Spaces near quotes etc.
Looks like added spaces are missing in diff output.
$htmlOld = 'He said:"OK!"';
$htmlNew = 'He said: "OK!"';
$htmlDiff = new \Caxy\HtmlDiff\HtmlDiff($htmlOld, $htmlNew);
echo $htmlDiff->build();
prints
He said:"OK!"
while I expected the added space after the colon to be highlighted. Using v0.1.14.
I can confirm this. It seems to be a regression between versions 0.1.10 and 0.1.11.
Still happening.
I came across a similar issue with spaces.
Comparing these two strings:
<strong>this </strong>is<strong> a string.</strong>
<strong>this</strong>is<strong>a string.</strong>
shows no differences.
I've investigated the issue and it seems the code intentionally ignores inserted/removed spaces. It makes sense for spaces between block elements, or where text is not expected, like between </li> and </ul>. But they shouldn't be ignored when they're inside block and inline elements that accept text as content.
Ignoring spaces based on the HTML context doesn't seem easy. I guess older versions chose not to ignore any spaces, so they would show changes where they shouldn't.
Maybe running an HTML parser that would remove invisible spaces, like the ones used for indentation, before running HtmlDiff would be the easiest and cleanest way. This could be a requirement for using HtmlDiff. It would make it easier because then we wouldn't need to ignore any spaces.
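To make the pre-normalization idea concrete, here is a minimal sketch. It is an assumption-laden illustration, not part of HtmlDiff: the function name normalizeInsignificantWhitespace() and the block-tag list are hypothetical, and it uses a regex approximation rather than a real HTML parser, so it does not protect <pre> or <textarea> content and it breaks on attribute values that contain ">".

```php
<?php
// Hypothetical helper, not part of caxy/php-htmldiff.
// Regex approximation of "remove invisible whitespace before diffing".
function normalizeInsignificantWhitespace(string $html): string
{
    // Assumed list of block-level tags; extend as needed.
    $block = 'ul|ol|li|table|thead|tbody|tfoot|tr|td|th|div|p|h[1-6]|blockquote|dl|dt|dd';
    $tag = '</?(?:' . $block . ')\b[^>]*>';

    // Drop whitespace right after an opening or closing block tag.
    $html = preg_replace('~(' . $tag . ')\s+~i', '$1', $html) ?? $html;
    // Drop whitespace right before an opening or closing block tag.
    $html = preg_replace('~\s+(' . $tag . ')~i', '$1', $html) ?? $html;
    // Collapse every remaining whitespace run to a single space.
    // No "u" modifier on purpose, so non-breaking spaces are left alone.
    return preg_replace('~\s+~', ' ', $html) ?? $html;
}

// Indentation-only differences disappear after normalization...
$indented = "<ul>\n  <li>one</li>\n  <li>two</li>\n</ul>";
$compact  = '<ul><li>one</li><li>two</li></ul>';
var_dump(normalizeInsignificantWhitespace($indented) === normalizeInsignificantWhitespace($compact)); // bool(true)

// ...while a space inside text content survives, so it can still be diffed.
var_dump(normalizeInsignificantWhitespace('He said:"OK!"') === normalizeInsignificantWhitespace('He said: "OK!"')); // bool(false)
```

Both inputs would be passed through this step before being handed to new \Caxy\HtmlDiff\HtmlDiff(...). HtmlDiff itself would then need to stop ignoring whitespace for the remaining, meaningful spaces to show up in the diff.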
I continue to experience the same problem even after encoding spaces to UTF-8. I attempted various strategies, including replacing spaces with a custom tag, but the comparison still does not recognize the custom tag as a difference.
A pull request with a potential solution for this has been merged, although it's not all-encompassing. See the notes on PR #111 for details on the new config option and its caveats.