php-htmldiff
Warning: DOMDocument::loadHTML(): Unexpected end tag : u in Entity
We sometimes get this "Unexpected end tag" warning, and this is how to reproduce it. The following PHP file is very sensitive to whitespace, so make sure every single space is copied correctly. The above warning seems to go away when we use the "keep new lines" config option and remove all the spaces.
<html>
<p>This code fails. To get it working, remove one space before the "ol" tag on line 31, which is just under "...Something here..." in $newHtml</p>
<?php
$oldHtml = '<ol>
<li><u>Publication:</u>
<ol>
<li>This sentence.</li>
</ol>
</li>
<li><u>Something here</u>:
<ol>
<li>Another item</li>
</ol>
</li>
</ol>
<ol>
<li><u>Mars</u>:</li>
<li>Saturn</li>
</ol>';
$newHtml = '<ol>
<li><u>Publication:</u>
<ol>
<li>This sentence.</li>
</ol>
</li>
<li><u>Something here</u>:
<ol>
<li>Another item</li>
</ol>
</li>
<li><u>Mars</u>:
<ol>
<li>Saturn</li>
</ol>
</li>
</ol>';
error_reporting(E_ALL);
ini_set('display_errors', '1');
require __DIR__ . '/../vendor/autoload.php';
use Caxy\HtmlDiff\HtmlDiff;
use Caxy\HtmlDiff\HtmlDiffConfig;
$config = new HtmlDiffConfig();
$config->setKeepNewLines(true);
$htmlDiff = HtmlDiff::create($oldHtml, $newHtml, $config);
$content = $htmlDiff->build();
echo "Diff is " . $content;
?>
</html>
We are having the same problem: DOMDocument::loadHTML(): Tag mark invalid in Entity. I found that this happens because the Caxy\HtmlDiff\ListDiffLines::listByLines() method uses DOMDocument::loadHTML(), and as far as I know libxml 2.6+ does not handle HTML5 tags correctly. This is actually a very widespread issue.
I think the libxml errors could be suppressed there by calling libxml_use_internal_errors(true) before loadHTML() and libxml_use_internal_errors(false) after it is done.
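A minimal sketch of the suppression idea described above, for illustration only. The helper name loadHtmlQuietly() is hypothetical and is not part of php-htmldiff; it only assumes PHP's standard libxml functions. It restores whatever setting was active before the call instead of hard-coding false, so it does not interfere with code that has already enabled internal error handling.

```php
<?php
// Hypothetical helper (not part of php-htmldiff): parse an HTML fragment
// while collecting libxml errors instead of letting them surface as
// PHP warnings such as "Unexpected end tag" or "Tag mark invalid".
function loadHtmlQuietly(string $html): array
{
    // Enable internal error handling; the call returns the previous setting.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Keep the collected errors so the caller can log or ignore them,
    // then empty libxml's error buffer.
    $errors = libxml_get_errors();
    libxml_clear_errors();

    // Restore the setting that was active before this call.
    libxml_use_internal_errors($previous);

    return [$dom, $errors];
}

// Usage: a stray closing </u> tag like the one in the warning above.
[$dom, $errors] = loadHtmlQuietly('<ol><li>Mars</u>:</li></ol>');
foreach ($errors as $error) {
    echo trim($error->message), PHP_EOL;
}
```

Which messages are collected depends on the installed libxml2 version, so the loop may print nothing on some systems.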
I will try to investigate this issue more deeply and write a PR, but it seems that no one is working on this repo, so there is almost no chance that my corrections will be accepted.
@MykhailoSukovitsyn If you get a PR open for this, we will review and merge