php-htmldiff

Lists being marked as changed when the output is 100% the same.

Open Ambient-Impact opened this issue 3 years ago • 3 comments

I've got the following two blocks of HTML being marked as changed, even though I've diffed them with WinMerge and it tells me they're 100% identical.

Block 1:

<h3>References<a id="References" href="#References" name="References" class="heading-permalink ambientimpact-link-has-image" aria-hidden="true" title="Permalink"><span class="ambientimpact-icon ambientimpact-icon--name-link ambientimpact-icon--bundle-libricons ambientimpact-icon--text-hidden ambientimpact-icon--icon-standalone ambientimpact-icon--is-bundle-loaded ambientimpact-icon--icon-standalone-loaded"><svg class="ambientimpact-icon__icon" viewBox="0 0 24 24" width="24" height="24" aria-hidden="true"><use xlink:href="/modules/ambientimpact/ambientimpact_icon/icons/libricons.svg?qq6uy7#icon-link"></use></svg><span class="ambientimpact-icon__text"><span class="ambientimpact-link-has-image__text">Permalink</span></span></span></a></h3>
<div class="references" role="doc-endnotes"><ol><li class="references__list-item" id="reference-conv" role="doc-endnote"><p>Kierney, L. (May 2029). “Bigger Fish To Fry: An Interview With William Lassgard.” <em>forbes.com</em>.&nbsp;<a class="references__backreference-link" rev="footnote" href="#backreference-conv" role="doc-backlink">↩</a></p></li>
<li class="references__list-item" id="reference-dott" role="doc-endnote"><p>Bridges, C. (August 2012). “Translation of domestication of Thunnus thynnus into an innovative commercial application.” <em>transdott.eu</em>.&nbsp;<a class="references__backreference-link" rev="footnote" href="#backreference-dott" role="doc-backlink">↩</a></p></li>
<li class="references__list-item" id="reference-12" role="doc-endnote"><p>Åkesson, N. (October 2039). “Leaked correspondence between Xu Shaoyong and William Lassgard paints dramatic picture.” <em>Dagens Nyheter</em>.&nbsp;<a class="references__backreference-link" rev="footnote" href="#backreference-12" role="doc-backlink">↩</a></p></li></ol></div></div>

Block 2:

<h3>References<a id="References" href="#References" name="References" class="heading-permalink ambientimpact-link-has-image" aria-hidden="true" title="Permalink"><span class="ambientimpact-icon ambientimpact-icon--name-link ambientimpact-icon--bundle-libricons ambientimpact-icon--text-hidden ambientimpact-icon--icon-standalone ambientimpact-icon--is-bundle-loaded ambientimpact-icon--icon-standalone-loaded"><svg class="ambientimpact-icon__icon" viewBox="0 0 24 24" width="24" height="24" aria-hidden="true"><use xlink:href="/modules/ambientimpact/ambientimpact_icon/icons/libricons.svg?qq6uy7#icon-link"></use></svg><span class="ambientimpact-icon__text"><span class="ambientimpact-link-has-image__text">Permalink</span></span></span></a></h3>
<div class="references" role="doc-endnotes"><ol><li class="references__list-item" id="reference-conv" role="doc-endnote"><p>Kierney, L. (May 2029). “Bigger Fish To Fry: An Interview With William Lassgard.” <em>forbes.com</em>.&nbsp;<a class="references__backreference-link" rev="footnote" href="#backreference-conv" role="doc-backlink">↩</a></p></li>
<li class="references__list-item" id="reference-dott" role="doc-endnote"><p>Bridges, C. (August 2012). “Translation of domestication of Thunnus thynnus into an innovative commercial application.” <em>transdott.eu</em>.&nbsp;<a class="references__backreference-link" rev="footnote" href="#backreference-dott" role="doc-backlink">↩</a></p></li>
<li class="references__list-item" id="reference-12" role="doc-endnote"><p>Åkesson, N. (October 2039). “Leaked correspondence between Xu Shaoyong and William Lassgard paints dramatic picture.” <em>Dagens Nyheter</em>.&nbsp;<a class="references__backreference-link" rev="footnote" href="#backreference-12" role="doc-backlink">↩</a></p></li></ol></div></div>

Could it be due to the emoji or the <p> elements? I'm not sure whether the <p> elements are valid nesting, so I'll likely try to remove them, but they're being generated automatically by CommonMark or a Drupal filter.

Ambient-Impact avatar Mar 19 '21 00:03 Ambient-Impact

I think I've figured this out! Turns out that it was indeed the <p> elements that were causing this; removing them seems to have restored the expected behaviour. I also tried adding 'p' to Caxy\HtmlDiff\ListDiffLines::listContentTags which also fixed it, so perhaps this element could be added to that array? According to the MDN entry for <li>, <p> elements and a few others that count as "flow content" are allowed in list items.
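For anyone who needs the workaround before a library change lands, here is a minimal sketch of the "remove the <p> elements" approach described above. It only uses PHP's bundled DOM extension; the function name `unwrapParagraphsInListItems` and the `unwrap-root` wrapper id are made up for illustration and are not part of php-htmldiff. It assumes the input is a UTF-8 HTML fragment, and DOMDocument may normalise whitespace or markup while re-serialising, so both the old and the new HTML should go through the same helper before being diffed.

```php
<?php
// Hypothetical helper, not part of php-htmldiff: replaces every <p> that is a
// direct child of an <li> with that <p>'s own children, so list items no
// longer contain paragraph wrappers.
function unwrapParagraphsInListItems(string $html): string
{
    $doc = new DOMDocument();

    // Silence parser warnings about elements the HTML4 parser does not know
    // (e.g. <svg>, <use>), then restore the previous error-handling setting.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration hints UTF-8 to the parser; the wrapper <div> lets us
    // find the fragment again after <html>/<body> are added automatically.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="unwrap-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Copy the matches into an array first so the DOM can be modified safely.
    foreach (iterator_to_array($xpath->query('//li/p')) as $p) {
        // Move each child of the <p> in front of it, preserving order.
        while ($p->firstChild !== null) {
            $p->parentNode->insertBefore($p->firstChild, $p);
        }
        // The <p> is now empty and can be dropped.
        $p->parentNode->removeChild($p);
    }

    // Serialise only the children of the wrapper, not the wrapper itself.
    $root = $xpath->query('//div[@id="unwrap-root"]')->item(0);
    $result = '';
    foreach ($root->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }

    return $result;
}
```

Note that a list item containing several <p> elements ends up with their contents run together, which was fine for my single-paragraph references but may not be for other content.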

@SavageTiger Thoughts?

Ambient-Impact avatar Jul 13 '21 04:07 Ambient-Impact

@Ambient-Impact This was a long time ago, I know, but I'm curious whether the <p> tags were getting removed by the HTML sanitizer / purifier that runs, I wonder... As you mentioned, in theory <p> tags should be fine within <li> tags. I feel like adding it to listContentTags would make sense; I'm curious why it wasn't there originally 🤔

jschroed91 avatar Nov 06 '23 00:11 jschroed91

This does feel like a lifetime ago. 😂

I don't know enough about what's handled by this library and what's handled by the purifier, but I'm guessing the oversight is probably because it's not a common thing to want to put <p> inside a <li> intentionally - I don't think I knew it was valid until I looked it up.

Ambient-Impact avatar Nov 07 '23 16:11 Ambient-Impact