
RSS XML: parsing the link tag

Open smirkchung opened this issue 7 years ago • 4 comments

Version 1.6.6 fixed the issue where the outerHtml, innerHtml, and innerText of an <option> tag could not be read. I really like html-agility-pack. When I parse the XML of an RSS website, the XML link tag has the same issue. I hope this can be fixed.

smirkchung avatar Dec 14 '17 06:12 smirkchung

Hello @smirkchung ,

Do you think you could provide us an example to get us started on this issue?

Best Regards,

Jonathan


Help us to support this library: Donate

JonathanMagnan avatar Dec 14 '17 18:12 JonathanMagnan

Sure! Here is a link to a BBC News feed: http://feeds.bbci.co.uk/news/world/rss.xml And here is my code:

HtmlDocument doc = new HtmlDocument();
string rssXml = File.ReadAllText(@"rss.xml");
doc.LoadHtml(rssXml);
var link = doc.DocumentNode.SelectNodes("//item/link");

Another issue concerns the 'link' tag. In HTML the tag has no end tag; in XHTML the tag must be properly closed. I hope HAP can parse both forms of 'link' (with and without an end tag).

If I disable the line ElementsFlags.Add("link", HtmlElementFlag.Empty) in HtmlNode.cs, it works. But I am not sure that is a good idea.
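The same effect can probably be had without editing HtmlNode.cs, because HtmlNode.ElementsFlags is a public static dictionary. Below is a minimal sketch, not a confirmed fix: the class name, the method name, and the idea of passing the feed in as a string are assumptions made for illustration. Note that the change is process-wide, so it also affects any real HTML parsed afterwards.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

static class RssLinkWorkaround // hypothetical helper, not part of HAP
{
    // Returns the text of every <link> found directly under an <item>.
    public static List<string> ExtractLinks(string rssXml)
    {
        // Drop the "empty element" flag so <link>...</link> keeps its content.
        // ElementsFlags is static: this applies to all later parsing as well.
        HtmlNode.ElementsFlags.Remove("link");

        var doc = new HtmlDocument();
        doc.LoadHtml(rssXml);

        var result = new List<string>();
        var nodes = doc.DocumentNode.SelectNodes("//item/link");
        if (nodes == null)
        {
            return result; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode node in nodes)
        {
            result.Add(node.InnerText.Trim());
        }
        return result;
    }
}
```

It could be called as RssLinkWorkaround.ExtractLinks(File.ReadAllText(@"rss.xml")), with rss.xml being the saved feed from the code above.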

Thanks very much. I look forward to the new version. If there is any problem with the example, tell me.

smirkchung avatar Dec 18 '17 10:12 smirkchung

I would ignore the case of a link tag with no end tag. HTML tags without an end tag date from a long time ago. Thanks

neo5657 avatar Dec 22 '17 08:12 neo5657

Hello @smirkchung ,

Unfortunately, HAP is currently only an HTML parser, not an XML parser.

We will certainly try to be more flexible when we rewrite this library, but after taking some time to look at this request, we have chosen to skip it for now and concentrate on the currently reported HTML bugs.

The request has been added to our task list, but I don't think we will get to it soon.
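In the meantime, since an RSS feed is XML, a dedicated XML parser may be the simpler route. Below is a minimal sketch using System.Xml.Linq from the .NET base class library; the class and method names are hypothetical, and it assumes an RSS 2.0 feed whose item and link elements are not in an XML namespace.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class RssLinks // hypothetical helper for illustration
{
    // Returns the text of every <link> that is a direct child of an <item>.
    public static List<string> FromXml(string rssXml)
    {
        XDocument doc = XDocument.Parse(rssXml);

        return doc.Descendants("item")          // every <item> in the feed
                  .Elements("link")             // its direct <link> children
                  .Select(e => e.Value.Trim())  // the URL text
                  .ToList();
    }
}
```

Namespaced elements such as atom:link are not matched by the unqualified name "link", so only the plain RSS link elements are returned.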

Best Regards,

Jonathan



JonathanMagnan avatar Dec 22 '17 22:12 JonathanMagnan