html-agility-pack
Closing Paragraph tag removed when no content provided
Related to issue #1
Encountered in v1.5.1: when parsing a <p> block with no content, HAP removes the closing </p>. For example:
Starting HTML: <p style="font-size:1px;"></p><p>test</p>
HTML after HAP: <p style="font-size:1px;"><p>test</p>
Setting HtmlAgilityPack.HtmlDocument.DisableBehavaiorTagP = true; seems to resolve this issue. Should we expect to keep using that option in future releases?
Hello @chrisnelsondotca,
I'm currently looking at some issues similar to this one.
I hope to be able to provide more information by next Monday.
Best Regards,
Jonathan
Hello @chrisnelsondotca,
We will try to work on all issues of this kind in September.
In a future release, you will not have to keep using this option. We will probably remove it outright once all issues of this kind are fixed.
We don't yet have a fixed date for it, but I will try to keep you updated once we start working on this issue.
Best Regards,
Jonathan
Has the behaviour of this changed? Setting DisableBehavaiorTagP = true still seems to remove closing P tags for me. I'm on version 1.7.1.
using System;
using HtmlAgilityPack;

// Disable HAP's special handling of <p> (the option discussed above).
HtmlDocument.DisableBehavaiorTagP = true;
var htmlDocument = new HtmlDocument();
const string testHtml = "<p>before<div>middle</div>after</p>";
htmlDocument.LoadHtml(testHtml);
var divNode = htmlDocument.DocumentNode.SelectSingleNode("/p/div");
var divParagraph = divNode.ParentNode;
// Try to close the paragraph before the <div> and reopen it afterwards.
divParagraph.InnerHtml = divParagraph.InnerHtml.Replace(divNode.OuterHtml, "</p>" + divNode.OuterHtml + "<p>");
Console.WriteLine(htmlDocument.DocumentNode.InnerHtml);
Expected result:
<p>before</p><div>middle</div><p>after</p>
Actual result:
<p>before<div>middle</div><p>after</p></p>
Now, this works in a browser, as the <div> element closes the first <p> element, but I'm writing something for Facebook Instant Articles, which is very strict and doesn't want <figure> elements inside <p> elements. Facebook being Facebook, it doesn't consider the behaviour of paragraph elements, and simply says the next element is a child.
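A possible workaround, offered only as a sketch: the InnerHtml setter appears to re-parse whatever fragment it is given, so the unbalanced "</p>" and "<p>" pieces may be repaired or dropped before they reach the document. Applying the same replacement to the whole serialized string keeps the tags balanced. The helper name SplitParagraphAround is hypothetical and is not part of HAP; the block uses plain strings only, so it runs without the library.

```csharp
using System;

// Hypothetical helper (not an HAP API): closes the enclosing <p> just before
// the given block element and reopens a new <p> right after it.
// Assumption: blockOuterHtml occurs in html exactly as serialized; every
// occurrence is replaced, because string.Replace replaces all matches.
static string SplitParagraphAround(string html, string blockOuterHtml) =>
    html.Replace(blockOuterHtml, "</p>" + blockOuterHtml + "<p>");

const string testHtml = "<p>before<div>middle</div>after</p>";
string result = SplitParagraphAround(testHtml, "<div>middle</div>");
Console.WriteLine(result); // prints <p>before</p><div>middle</div><p>after</p>
```

With HAP, the two arguments could presumably be htmlDocument.DocumentNode.OuterHtml and divNode.OuterHtml, with the result passed back to LoadHtml. Note that a block element sitting first or last in its paragraph would leave an empty <p></p> behind, which is the case the original report says HAP mishandles.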
Hello @Rene-Sackers,
Thank you for reporting; we will look at it.
We added some methods that allow us to handle this kind of scenario more easily, so perhaps we can now do something about it.
Best Regards,
Jonathan