html-agility-pack
RSS XML: parsing the link tag
Version 1.6.6 fixed the issue where the outerHtml, innerHtml, and innerText of an <option> tag could not be read.
I really like html-agility-pack.
When I parse the XML of an RSS website, the XML link tag has the same issue.
I hope this can be fixed as well.
Hello @smirkchung ,
Do you think you could provide us an example to get us started on this issue?
Best Regards,
Jonathan
Help us to support this library: Donate
Sure! Here is a link to the BBC News feed: http://feeds.bbci.co.uk/news/world/rss.xml
This is my code:
HtmlDocument doc = new HtmlDocument();
string rssXml = File.ReadAllText(@"rss.xml");
doc.LoadHtml(rssXml);
var link = doc.DocumentNode.SelectNodes("//item/link");
The other issue concerns the 'link' tag. In HTML the tag has no end tag; in XHTML the tag must be properly closed. I hope HAP can parse both forms of the 'link' tag (with and without an end tag).
I disabled the line ElementsFlags.Add("link", HtmlElementFlag.Empty) in HtmlNode.cs.
That works, but I am not sure it is a good idea.
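A minimal sketch of the same workaround applied at run time instead of by editing HtmlNode.cs. It assumes that HtmlNode.ElementsFlags is exposed as a public static table in the HAP version in use, and that rss.xml is a local copy of the feed. The class name RssLinkExample is made up for illustration.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

class RssLinkExample
{
    static void Main()
    {
        // Assumption: ElementsFlags is the public static table that marks
        // "link" as an empty element. Removing the entry is a global change
        // and applies to every HtmlDocument loaded afterwards.
        HtmlNode.ElementsFlags.Remove("link");

        var doc = new HtmlDocument();
        doc.LoadHtml(File.ReadAllText(@"rss.xml"));

        // SelectNodes returns null when nothing matches, so check first.
        var links = doc.DocumentNode.SelectNodes("//item/link");
        if (links == null)
        {
            Console.WriteLine("No link nodes found.");
            return;
        }

        foreach (var link in links)
        {
            Console.WriteLine(link.InnerText);
        }
    }
}
```

Because the change is global, ordinary HTML pages that use link without an end tag may be parsed differently afterwards, which is the reason for the doubt about whether this is a good idea.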
Thanks very much. I look forward to the new version. If there is any problem with the example, please tell me.
Feel free to ignore the case of a link with no end tag. HTML tags without an end tag date from a long time ago. Thanks.
Hello @smirkchung ,
Unfortunately, HAP is currently only an HTML parser, not an XML parser.
We will certainly try to be more flexible when we rewrite this library, but after taking some time to look at this request, we have chosen to skip it for now and concentrate on the bugs currently reported for HTML.
The request has been added to our task list, but I don't think we will get to it soon.
Best Regards,
Jonathan