
No longer reporting ParseErrors for Invalid HTML

Open brianfeucht opened this issue 6 years ago • 4 comments

I was able to reproduce this via this test:

    using System.Linq;            // Count() on ParseErrors
    using HtmlAgilityPack;
    using NUnit.Framework;

    [TestFixture]
    public class OptionInBodyCreatesParseError
    {
        [Test(Description = "Ensures ParseError is created for Option in Body")]
        public void EnsureParseErrorIsCreatedForOptionInBody()
        {
            var html = "<html><body><option></option></body></html>";
            var doc = new HtmlDocument();

            doc.LoadHtml(html);

            Assert.That(doc.ParseErrors.Count(), Is.EqualTo(1));
        }
    }

Running this HTML through the W3C validator confirms that it is not valid. Previous versions of HAP reported errors for it via the ParseErrors property. That is no longer the case with the latest package on NuGet or with the latest code in master.
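
To compare what different HAP versions report for this input, it may help to print each entry in ParseErrors rather than only counting them. The following is a minimal sketch, not part of the original test: the class name ParseErrorDump is made up for illustration, and which errors appear (if any) depends on the HAP version in use.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper for illustration: lists whatever parse errors
// the installed HAP version reports for the markup from the test.
public static class ParseErrorDump
{
    public static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body><option></option></body></html>");

        // ParseErrors is a sequence of HtmlParseError; it may be empty.
        foreach (HtmlParseError error in doc.ParseErrors)
        {
            Console.WriteLine(
                $"{error.Code} at line {error.Line}, column {error.LinePosition}: {error.Reason}");
        }
    }
}
```

Running this against two package versions side by side should show whether the difference is a missing error or a differently classified one.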

brianfeucht • Dec 05 '17 19:12

Hello @brianfeucht ,

Thank you for reporting this.

We will look at how option and select were validated before we applied our latest changes.

Best Regards,

Jonathan


Help us to support this library: Donate

JonathanMagnan • Dec 05 '17 21:12

Hello @brianfeucht ,

The previous version of HAP returned the following error: End tag </option> is not required

The error you would expect here is rather that the SELECT tag is missing.

For now, I'm not sure whether we want to start handling this kind of error, since the library doesn't currently handle it.

We will certainly try to do it in 2018, but I will mark this request as nice-to-have for now.

Best Regards,

Jonathan



JonathanMagnan • Dec 08 '17 19:12

@JonathanMagnan for what it's worth, build 1.5.5 returns a parsing error for these cases.

This is blocking us from converting projects to .NET Standard. Something about the 1.5.5 build causes the following error after we upgrade a dependent project to .NET Standard 2.0 (the project throwing it has HtmlAgilityPack added as a NuGet package):

error CS0012: The type 'HtmlNode' is defined in an assembly that is not referenced. You must add a reference to assembly 'HtmlAgilityPack, Version=1.5.5.0, Culture=neutral, PublicKeyToken=null'.

After upgrading to 1.6.6, the build error goes away, but our unit tests that depend on ParseErrors start failing.

I'm happy to help fix this. Can you summarize why this stopped working between build 1.5.5 and 1.6.6?

brianfeucht • Dec 08 '17 19:12

Sorry, I just reread your previous comment. I'll see what I can do on our side to change our unit tests, then.

brianfeucht • Dec 08 '17 19:12