[BUG] HtmlAgilityPack: await web.LoadFromWebAsync not working on Windows Server 2016 and 2019
1. Description
I am getting no response, and it acts as if nothing is happening. I tried the async approach from https://github.com/zzzprojects/html-agility-pack/issues/171 and it just sits there with the cursor blinking and does nothing.
var web = new HtmlWeb();
web.UsingCache = false;
web.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36";
var doc = await web.LoadFromWebAsync(page);
I also tried without the user agent, but it made no difference.
2. Exception
No exceptions are thrown, and I cannot debug because I don't have the right tooling on the servers. It works fine on Windows 10.
Exception message:
No exceptions.
3. Fiddle or Project
I am unable to provide one, but I can see the network traffic spike, so it looks like the data is even received, yet nothing is shown.
4. Any further technical details
- HAP version: 1.11.34.0
- .NET version: 4.6.1
Hello @broomop ,
Everything worked when I tried it.
I recommend trying it with ConfigureAwait(false):
await web.LoadFromWebAsync(html).ConfigureAwait(false)
Depending on the type of application, this may be required to avoid a thread deadlock.
Best Regards,
Jon
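For readers wondering where the deadlock comes from: in applications with a synchronization context (for example WinForms, WPF, or classic ASP.NET), an awaited continuation is posted back to the original thread by default, and if that thread is blocked waiting on the task, neither side can proceed. The following is only a sketch of the suggestion above, not code from the library. `LoadTitleAsync` and the URL are hypothetical, and the console `Main` is there just to make the snippet runnable; a console app has no synchronization context, so it will not reproduce the deadlock itself.

```csharp
using System;
using System.Threading.Tasks;
using HtmlAgilityPack;

class ConfigureAwaitSketch
{
    // Hypothetical helper, not part of HAP.
    static async Task<string> LoadTitleAsync(string url)
    {
        var web = new HtmlWeb();

        // ConfigureAwait(false) tells this await not to resume on the captured
        // context, so the code below can run on a thread-pool thread even if
        // the original thread is blocked waiting for the result.
        HtmlDocument doc = await web.LoadFromWebAsync(url).ConfigureAwait(false);

        HtmlNode title = doc.DocumentNode.SelectSingleNode("//title");
        return title != null ? title.InnerText : null;
    }

    static void Main()
    {
        try
        {
            // Blocking here is safe in a console app; in a UI or classic ASP.NET
            // app, prefer awaiting all the way up instead of blocking.
            string title = LoadTitleAsync("https://example.com/").GetAwaiter().GetResult();
            Console.WriteLine(title ?? "(no title element)");
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

Note that ConfigureAwait(false) only affects the await it is applied to. Whether it is enough depends on how the awaits further down the call chain behave, which this thread does not confirm for HAP.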
Did you try this on Windows Server 2016 or 2019? I had no issues on Windows 10, only on the server editions.
Yes, the test was on Windows Server 2016.
It might also be caused by some security policy on your side.
The library is using an HttpClient: https://github.com/zzzprojects/html-agility-pack/blob/08694be2d81e552ec87e19082396f5d57d8832c2/src/HtmlAgilityPack.Shared/HtmlWeb.cs#L2364
So perhaps you could fetch the text on your side and simply have HAP parse it afterwards. Unfortunately, I don't really see anything we could change that would help you ;(
Hi, it seems that HttpClient isn't liked very much anymore. I am trying to see whether I can unblock the HttpClient; supposedly it has to do with ASP.NET, web.config, and allowing logins. If you can help any further with this, that would be great; otherwise, thanks for your help.
After reading some more, I found someone mentioning that HttpClient is not thread-safe the way it is being used, and that an HttpResponseMessage should be used as well:
https://docs.microsoft.com/en-us/dotnet/api/system.net.http.httpclient?view=netframework-4.7.2
I have figured out that on a Windows server you have to code it exactly like this and wait on each call:
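using System;
using System.Threading.Tasks;
using HtmlAgilityPack;

class Program
{
    // Placeholder: the original snippet did not show how url was defined.
    static readonly string url = "https://example.com/";

    static void Main()
    {
        try
        {
            // Block until the asynchronous load has finished.
            GetHtmlDocumentAsync().GetAwaiter().GetResult();
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
        try
        {
            // Synchronous load, for comparison.
            HtmlDocument test = GetHtmlDocument();
            Console.WriteLine(test.Text);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
        Console.ReadLine();
    }

    public static async Task<HtmlDocument> GetHtmlDocumentAsync()
    {
        HtmlWeb web = new HtmlWeb();
        return await web.LoadFromWebAsync(url);
    }

    public static HtmlDocument GetHtmlDocument()
    {
        HtmlWeb web = new HtmlWeb();
        return web.Load(url);
    }
}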
instead of just making lots of await web.LoadFromWebAsync(url); calls with no handling.
Could I also ask how you would set up some sort of threaded method so that my application does not lock up?
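One common way to keep the calling thread free, offered as a sketch rather than an answer from the maintainers, is to push the blocking `Load` call onto the thread pool with `Task.Run`. The URL below is a placeholder, and the polling loop merely stands in for whatever the application would otherwise be doing.

```csharp
using System;
using System.Threading.Tasks;
using HtmlAgilityPack;

class BackgroundLoadSketch
{
    static void Main()
    {
        string url = "https://example.com/"; // placeholder URL

        // Task.Run queues the blocking Load call to a thread-pool thread.
        Task<HtmlDocument> loadTask = Task.Run(() => new HtmlWeb().Load(url));

        try
        {
            // The calling thread stays free. Wait(int) returns false when the
            // timeout elapses before the task has completed.
            while (!loadTask.Wait(250))
            {
                Console.Write(".");
            }

            HtmlDocument doc = loadTask.Result;
            Console.WriteLine();
            Console.WriteLine("Loaded " + doc.DocumentNode.OuterHtml.Length + " characters.");
        }
        catch (AggregateException ex)
        {
            // Exceptions thrown inside the task surface here, wrapped.
            Console.WriteLine(ex.GetBaseException().Message);
        }
    }
}
```

In a UI application the same idea would usually be written as `await Task.Run(...)` inside an async event handler, so the UI thread is never blocked at all.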
Hello @broomop ,
Thank you for the information about the HttpClient not being thread-safe.
In this case, I really recommend fetching the HTML on your side and just using the LoadHtml method of HtmlDocument to parse it.
HAP as currently built doesn't work with a static HttpClient, so there are some issues we need to discuss here first to determine how we want to solve this.
However, you can already solve it on your side by applying the best practices you have found and fetching the HTML yourself.
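A minimal sketch of that approach, assuming a console app that references System.Net.Http and the HtmlAgilityPack package. `FetchAndParseAsync` and the URL are hypothetical names for illustration. The Microsoft documentation linked earlier describes HttpClient as intended to be instantiated once and reused, and lists `GetAsync` among its thread-safe methods, so the sketch shares a single static instance.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

class FetchThenParseSketch
{
    // One shared HttpClient for the lifetime of the process.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: download the page ourselves, then let HAP parse the string.
    static async Task<HtmlDocument> FetchAndParseAsync(string url)
    {
        using (HttpResponseMessage response = await Client.GetAsync(url).ConfigureAwait(false))
        {
            // Throws HttpRequestException for non-success status codes.
            response.EnsureSuccessStatusCode();

            string html = await response.Content.ReadAsStringAsync().ConfigureAwait(false);

            var doc = new HtmlDocument();
            doc.LoadHtml(html); // parsing only, no network access inside HAP
            return doc;
        }
    }

    static void Main()
    {
        try
        {
            HtmlDocument doc = FetchAndParseAsync("https://example.com/").GetAwaiter().GetResult();
            HtmlNode title = doc.DocumentNode.SelectSingleNode("//title");
            Console.WriteLine(title != null ? title.InnerText : "(no title element)");
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

Keeping the download separate also means any HTTP failure shows up as an ordinary exception from HttpClient, which may make a server-only problem like the one in this thread easier to narrow down.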