html-agility-pack
HtmlNode.InnerText Not working properly
InnerText returns "&nbsp;"
1. Description
<p id="demo">hello&nbsp;</p>
var text = htmlNode.InnerText;
// Actual:
"hello&nbsp;"
// Expected:
"hello "
4. Any further technical details
Add any relevant details that can help us, such as:
- HAP version: 1.8.1
- NET version: .NET 4.0
Hello @MartinHenkeQP,
For backward compatibility, we chose to leave this behavior as it is. Many people use the library to parse text and expect to get the &nbsp; entity and not a space (even if a space is really what should be expected).
From past experience, we have learned that this kind of small fix generally causes a lot of issues for people using our library in production, so we try as much as possible not to touch it.
You can fix it on your side by simply decoding the HTML:
// HttpUtility is in the System.Web namespace
HtmlDocument htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(@"<p id=""demo"">&nbsp;hello&nbsp;</p>");
var text = HttpUtility.HtmlDecode(htmlDocument.DocumentNode.InnerText);
Best Regards,
Jon
Dear Jon,
Thanks for the info. However, to simplify the process and avoid this issue being reported again, it might be an option to provide another method, e.g. GetInnerText(), and add a comment on the InnerText property pointing to GetInnerText() for the decoded text. This would not break compatibility and would support developers who need the decoded text.
Best regards,
Martin
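For readers who want this today, the suggestion above can be approximated on the caller's side with an extension method. This is only a sketch: `GetInnerText()` and the `HtmlNodeTextExtensions` class are hypothetical names taken from the proposal, not part of HtmlAgilityPack, and it uses `System.Net.WebUtility.HtmlDecode` instead of `HttpUtility` so that no reference to System.Web is needed.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

// Hypothetical helper; HtmlAgilityPack itself does not ship GetInnerText().
public static class HtmlNodeTextExtensions
{
    // Returns the node's InnerText with HTML entities decoded.
    public static string GetInnerText(this HtmlNode node)
    {
        if (node == null) throw new ArgumentNullException("node");
        return WebUtility.HtmlDecode(node.InnerText);
    }
}

public static class Demo
{
    public static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(@"<p id=""demo"">hello&nbsp;</p>");
        HtmlNode p = doc.DocumentNode.SelectSingleNode("//p[@id='demo']");

        Console.WriteLine(p.InnerText);      // raw text, entity left as-is: hello&nbsp;
        Console.WriteLine(p.GetInnerText()); // entity decoded to U+00A0
    }
}
```

Note that decoding `&nbsp;` yields a no-break space (U+00A0), not an ordinary space (U+0020). If an ordinary space is needed, something like `.Replace('\u00A0', ' ')` can be applied to the decoded string.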
From: Jonathan Magnan [email protected]
Sent: Monday, 8 March 2021 18:03
To: zzzprojects/html-agility-pack
Cc: Martin Henke; Mention
Subject: Re: [zzzprojects/html-agility-pack] HtmlNode.InnerText Not working properly (#427)