get page is incomplete

Open snowchenlei opened this issue 3 years ago • 3 comments

var document = await web.LoadFromWebAsync("http://whsggzy.wuhu.gov.cn/whggzyjy/005/005001/005001001/20220819/87d3fb1c-eeb0-457f-b040-3a74593cda1d.html");

However, the content obtained is different from the original web page. For the element at the XPath //*[@id="process-list"]/li[1], the date-url attribute is empty.

snowchenlei avatar Aug 23 '22 07:08 snowchenlei

Hello @snowchenlei ,

I quickly looked at it, and it might be because the data is populated after the page is loaded. However, the LoadFromWeb method gets the source HTML, not the final HTML after the page is rendered.
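To illustrate the difference, here is a minimal sketch that parses a made-up HTML snippet (not the real page) in which a script would fill the attribute in a browser. It assumes only the HtmlAgilityPack NuGet package; the markup and the '/some/path' value are hypothetical.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical source HTML: the attribute is empty until the script runs in a browser.
const string sourceHtml = @"<ul id=""process-list""><li date-url=""""></li></ul>
<script>document.querySelector('#process-list li').setAttribute('date-url', '/some/path');</script>";

var document = new HtmlDocument();
document.LoadHtml(sourceHtml);

// HtmlAgilityPack parses the markup but never executes the script.
var node = document.DocumentNode.SelectSingleNode("//*[@id='process-list']/li[1]");
Console.WriteLine(node.GetAttributeValue("date-url", "(missing)")); // prints an empty line
```

The attribute stays empty because the parser only sees the markup as delivered, which matches the behavior reported above.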

There are 2 ways to solve it:

  • Use LoadFromBrowser if you are still on .NET Framework: https://html-agility-pack.net/from-browser
  • Use the Selenium library, which lets you interact with the HTML once the page is loaded.

Best Regards,

Jon


JonathanMagnan avatar Aug 23 '22 13:08 JonathanMagnan

I'm using .NET Core, unfortunately.

snowchenlei avatar Aug 24 '22 06:08 snowchenlei

So I recommend using Selenium WebDriver, if that's possible: https://riptutorial.com/selenium-webdriver/learn/100000/overview

It works really well and, depending on what you want to achieve, might open up a lot of opportunities.
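A rough sketch of how this could look on .NET Core, not a tested solution for this particular site. It assumes the Selenium.WebDriver, Selenium.Support, and HtmlAgilityPack NuGet packages, a locally installed Chrome, and that the date-url attribute is filled in by script within 15 seconds; the timeout and the headless flag are arbitrary choices.

```csharp
using System;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

const string url = "http://whsggzy.wuhu.gov.cn/whggzyjy/005/005001/005001001/20220819/87d3fb1c-eeb0-457f-b040-3a74593cda1d.html";
const string xpath = "//*[@id=\"process-list\"]/li[1]";

var options = new ChromeOptions();
options.AddArgument("--headless=new"); // assumption: a recent Chrome that supports this flag

using var driver = new ChromeDriver(options);
driver.Navigate().GoToUrl(url);

// Wait until the script on the page has populated the attribute (assumed to happen).
var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(15));
wait.Until(d => !string.IsNullOrEmpty(d.FindElement(By.XPath(xpath)).GetDomAttribute("date-url")));

// Hand the rendered HTML to HtmlAgilityPack and query it as usual.
var document = new HtmlDocument();
document.LoadHtml(driver.PageSource);

var node = document.DocumentNode.SelectSingleNode(xpath);
Console.WriteLine(node?.GetAttributeValue("date-url", string.Empty));
```

The browser does the rendering and HtmlAgilityPack only parses driver.PageSource, so existing XPath queries can stay unchanged.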

JonathanMagnan avatar Aug 24 '22 13:08 JonathanMagnan