html-agility-pack
SelectNodes ignores 'empty' tags
Not sure if I'm doing something wrong, but when I use SelectNodes with a fairly simple XPath, Agility 'ignores' tags that don't contain text (InnerText):
...SelectNodes("//div/ul/li/span")
<span ...> <<<--- This one is being ignored
<span ...>This is a test <<<--- This one is fine
FYI - The spans always appear in pairs, but I need to check a class name in the first (textless) one (...and yes, there's a workaround, but it is a bit cumbersome...)
Hello @ganr8790 ,
Could you provide an example that isn't working?
The following one returns 5 nodes for me:
var html = @"
<div>
<ul>
<li><span></li>
<li><span /></li>
<li><span></span></li>
<li><span>a</li>
<li><span>b</span></li>
</ul>
</div>
";
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var nodes = doc.DocumentNode.SelectNodes("//div/ul/li/span")
.ToList();
Best Regards,
Jonathan
Hi Jon
The HTML is slightly different…
- Text Here… Text Here… Text Here…
I was expecting a list containing the outer spans, but I only get the inner ones (which I thought would only be the case if the XPath I used was "//div/ul/li/span/span").
My workaround is to search for the
I look for the value ‘myclass…’
Regards
Rami
Which version are you using?
The code returns the outer span without a problem, with the InnerHtml <span>Text Here…</span>
and the right class name.
Best Regards,
Jonathan
v1.5.1 – 6 Jul 2017 (I’m using VS2017 with the latest update)
Thanks – I’m off…
Sorry, it was late – the xpath was actually: “div/ul/li/span”
(the // causes the search to start from the html’s root
irrespective of the current node)
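To illustrate the note about // above, here is a minimal sketch, assuming a small made-up document (the markup, the ids and the XPathScopeDemo class are hypothetical and not taken from the page in this thread). It evaluates three XPath expressions from the same context node: "//span" follows standard XPath rules and searches the whole document, while ".//span" and "span" stay relative to the current node.

```csharp
using System;
using HtmlAgilityPack;

public static class XPathScopeDemo
{
    // Hypothetical markup, not the HTML discussed in this thread.
    public const string Html =
        "<div id='a'><span>one</span></div>" +
        "<div id='b'><span>two</span><span>three</span></div>";

    // Counts the nodes an XPath selects when evaluated from the second div.
    public static int CountFromSecondDiv(string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Html);
        var context = doc.DocumentNode.SelectSingleNode("//div[@id='b']");
        var nodes = context.SelectNodes(xpath); // SelectNodes returns null when nothing matches
        return nodes == null ? 0 : nodes.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountFromSecondDiv("//span"));  // searches from the document root
        Console.WriteLine(CountFromSecondDiv(".//span")); // descendants of the context node only
        Console.WriteLine(CountFromSecondDiv("span"));    // direct children of the context node only
    }
}
```

With this markup the first call counts the spans in both divs, while the other two only count the spans inside the second div.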
BTW – note that the below is an extract of a large HTML!
.
.
.
<ul>
  <li>
    <span class="myclass…">
      <span>Text Here…</span>
    </span>
    <span class="myclass…">
      <span>Text Here…</span>
    </span>
    <span class="myclass…">
      <span>Text Here…</span>
    </span>
  </li>
</ul>
.
.
.
Maybe this will give you a clue - when I look at the Watch of the
InnerText of the above xpath query I can see all the ‘Text Here’
without any spaces!
Hello @ganr8790 ,
InnerText only shows the inner text (the text, not the HTML tags), which is Text Here…
InnerHtml shows the inner HTML (all the markup), which is <span>Text Here…</span>
Perhaps it's just a mix-up between the two properties?
Best Regards,
Jonathan
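The difference between the two properties can be sketched as follows. This is only an illustration under assumptions: the markup is a trimmed, made-up version of the extract posted earlier, "myclass" stands in for the elided class name, and the InnerTextVsInnerHtml class is hypothetical.

```csharp
using System;
using HtmlAgilityPack;

public static class InnerTextVsInnerHtml
{
    // Hypothetical markup modelled on the extract in this thread;
    // "myclass" is a stand-in for the elided class name.
    public const string Html =
        "<ul><li><span class=\"myclass\"><span>Text Here</span></span></li></ul>";

    // Selects the first outer span, i.e. the span that is a direct child of the li.
    public static HtmlNode FirstOuterSpan()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Html);
        return doc.DocumentNode.SelectSingleNode("//ul/li/span");
    }

    public static void Main()
    {
        var outer = FirstOuterSpan();
        Console.WriteLine(outer.InnerText);                      // text only: Text Here
        Console.WriteLine(outer.InnerHtml);                      // child markup: <span>Text Here</span>
        Console.WriteLine(outer.GetAttributeValue("class", "")); // class of the outer span: myclass
    }
}
```

Since InnerText drops the tags, an outer span and its inner span can show the same value in the debugger's Watch window; checking InnerHtml, OuterHtml or the class attribute tells you which node was actually selected.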