
Creating empty value for attribute

Open joffremota opened this issue 1 year ago • 8 comments

1. Description

I've got the following value <a download href=\"/downloads/arquivo.zip\">Download do Arquivo</a>

When I pass this into the following method in order to add the target="_blank" attribute, I get this: <a download="" href=\"/downloads/arquivo.zip\" target=\"_blank\">Download do Arquivo</a>

How can I prevent the library from adding the empty value to download?

Here is the method.

        private string UpdateAnchorTagsWithTargetBlank(string html)
        {
            var doc = new HtmlDocument();

            doc.LoadHtml(html);
            var anchorNodes = doc.DocumentNode.SelectNodes("//a[@href]");
            if (anchorNodes != null)
            {
                foreach (var node in anchorNodes)
                {
                    if (node.GetAttributeValue("target", "") != "_blank")
                        node.SetAttributeValue("target", "_blank");
                }
            }

            return doc.DocumentNode.OuterHtml;
        }

2. Exception

Not applicable

3. Fiddle or Project

Not applicable

4. Any further technical details

  • HtmlAgilityPack (1.11.40)
  • SDK Version: 6.0.404

joffremota, Jun 05 '24 21:06

It's not so obvious, but to get the desired behavior you have to configure HtmlDocument.GlobalAttributeValueQuote to use AttributeValueQuote.Initial, i.e.:

var doc = new HtmlDocument()
{
    GlobalAttributeValueQuote = AttributeValueQuote.Initial
};

(This could also be done after loading an HTML document.)
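
For illustration, here is a minimal sketch of the "after loading" variant. It assumes the HtmlAgilityPack NuGet package is referenced and that top-level statements are available (C# 9 or later); the variable names are only for this example, and the input is the anchor tag from the issue description.

```csharp
using System;
using HtmlAgilityPack;

// Sample input taken from the issue description.
var html = "<a download href=\"/downloads/arquivo.zip\">Download do Arquivo</a>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// Set after loading. Assumption: the option is read when the document is
// serialized, so it still applies to the already parsed attributes.
doc.GlobalAttributeValueQuote = AttributeValueQuote.Initial;

var output = doc.DocumentNode.OuterHtml;
Console.WriteLine(output);
```

Setting the property in the object initializer, as shown above, or after LoadHtml should be equivalent as long as it happens before OuterHtml is read.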



EDIT: I just noticed that the newly created target attribute will have single-quotes when setting up the HtmlDocument instance with AttributeValueQuote.Initial and it's impossible to change this by fiddling with the HtmlAttribute's QuoteType property. Dang! If you can't tolerate single quotes, my suggested solution isn't acceptable, unfortunately.

The problem is the internal field HtmlAttribute.InternalQuoteType being left untouched for newly created attributes and therefore initialized with the default value (which is SingleQuote). Either the cause is the untouched HtmlAttribute.InternalQuoteType field itself or this if expression is borked: https://github.com/zzzprojects/html-agility-pack/blob/8efd5da60c9d6e70939dbd3c58b0f08ebcfa0ad7/src/HtmlAgilityPack.Shared/HtmlNode.cs#L2352-L2355

elgonzo, Jun 05 '24 22:06

Thank you @elgonzo ,

Indeed, to keep the attribute as it is, your proposed solution is perfect: doc.GlobalAttributeValueQuote = AttributeValueQuote.Initial;

As for the SingleQuote problem, I guess the only way at this moment is to use reflection to set the value to DoubleQuote.

Such as:

using System.Linq; // needed for Single() below
using HtmlAgilityPack;

var html = "<a download href=\"/downloads/arquivo.zip\">Download do Arquivo</a>";
var doc = new HtmlDocument();
doc.GlobalAttributeValueQuote = AttributeValueQuote.Initial;
doc.LoadHtml(html);

var anchorNodes = doc.DocumentNode.SelectNodes("//a[@href]");
if (anchorNodes != null)
{
	foreach (var node in anchorNodes)
	{
		if (node.GetAttributeValue("target", "") != "_blank")
		{
			node.SetAttributeValue("target", "_blank");
			var targetAttribute = node.GetAttributes("target").Single();
			var internalQuoteTypeProperty = typeof(HtmlAgilityPack.HtmlAttribute).GetProperty("InternalQuoteType", System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
			internalQuoteTypeProperty.SetValue(targetAttribute, AttributeValueQuote.DoubleQuote);
		}                            
	}
}

var outputHtml = doc.DocumentNode.OuterHtml;

Best Regards,

Jon

JonathanMagnan, Jun 06 '24 13:06

Hi @JonathanMagnan ,

I just submitted a PR proposing a fix for this issue. Can you please take a look? I am facing the same issue and need this to be fixed in my system :).

Thanks in advance and best regards, POFerro

POFerro, Jun 21 '24 13:06

Thank you @POFerro for your PR.

I will try to look at it very soon.

Best Regards,

Jon

JonathanMagnan, Jun 22 '24 15:06

Hi @JonathanMagnan ,

Any news? :)

POFerro, Jul 16 '24 14:07

Hello @POFerro ,

Sorry for the delay. I didn't say it, but I have been on vacation since June 25 (a few days after your PR).

I'm returning tomorrow, so I will look at it next week and merge it if it is accepted.

Best Regards,

Jon

JonathanMagnan, Jul 16 '24 15:07

Hello @POFerro ,

Thank you again for your pull request. It has been merged and released in version v1.11.62.

Honestly, I'm always afraid of the side effects a change like this may cause for other developers, since the download attribute is no longer written with double quotes, but I guess we will see in the following weeks whether people report this new behavior as an issue.

@joffremota , could you confirm it indeed fixed your issue as well? It seems to work flawlessly on my side.

Best Regards,

Jon

JonathanMagnan, Jul 31 '24 15:07

Hi @JonathanMagnan

Thanks for accepting the PR. I already tested it in my case and it works like a charm.

Thanks and best regards ;) POFerro

POFerro, Aug 09 '24 15:08