html-agility-pack
Parsing error
Description
If an element's attribute value contains an apostrophe inside single quotes, e.g. <img src='/image.jpg' alt='Savannah's streets'>,
the quote inside the quotes ('Savannah's streets') breaks the HTML.
The output will be <img src='/image.jpg' alt='Savannah' s="" streets'="" >
Exception - Malformed HTML
alt='Savannah' s="" streets'=""
Further technical details
- HAP version: still occurs in 1.11.7
- .NET version: 4.6
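The report above can be checked with a small console program. This is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced and that the markup is loaded from a string with `LoadHtml`; the names `Repro` and `ReadAttributes` are made up for illustration. It prints every attribute the parser found on the `<img>` element and then the serialized markup, so the way `alt` gets split can be compared with the output quoted in the description.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

static class Repro
{
    // Hypothetical helper: returns "name=value" for every attribute
    // the parser recognized on the first <img> element.
    public static List<string> ReadAttributes(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string>();
        HtmlNode img = doc.DocumentNode.SelectSingleNode("//img");
        if (img == null) return result; // no <img> in the input

        foreach (HtmlAttribute attr in img.Attributes)
            result.Add(attr.Name + "=" + attr.Value);
        return result;
    }

    static void Main()
    {
        // The alt value is delimited by single quotes and also contains an apostrophe.
        string html = "<img src='/image.jpg' alt='Savannah's streets'>";

        foreach (string line in ReadAttributes(html))
            Console.WriteLine(line);

        // Serialized markup after the round trip through the parser.
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```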
Hello @RahulMathew,
I'm not sure what the problem is here; we get the same behavior when using an HTML inspector directly in a browser.
<img src="/image.jpg" alt="Savannah" s="" streets'="">
The HTML is malformed, so there is not much we can do; we do pretty much the same thing as a browser does here, unless I'm missing something.
Best Regards,
Jonathan
Performance Libraries
context.BulkInsert(list, options => options.BatchSize = 1000);
Entity Framework Extensions • Entity Framework Classic • Bulk Operations • Dapper Plus
Runtime Evaluation
Eval.Execute("x + y", new {x = 1, y = 2}); // return 3
C# Eval Function • SQL Eval Function
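Since the parser follows browser behavior for an unescaped apostrophe inside a single-quoted value, one way around the problem is to make the markup well-formed before it reaches the parser. The sketch below is one possible approach, not something proposed in the thread: it assumes you control how the `<img>` tag is generated, and `ImgMarkup.Build` is a hypothetical helper, not part of HtmlAgilityPack. It encodes the attribute values with `System.Net.WebUtility.HtmlEncode` and wraps them in double quotes.

```csharp
using System;
using System.Net; // WebUtility is part of .NET; no extra package needed

static class ImgMarkup
{
    // Hypothetical helper: builds an <img> tag whose attribute values are
    // HTML-encoded and then wrapped in double quotes.
    public static string Build(string src, string alt)
    {
        return "<img src=\"" + WebUtility.HtmlEncode(src) +
               "\" alt=\"" + WebUtility.HtmlEncode(alt) + "\">";
    }
}

class Program
{
    static void Main()
    {
        // Prints: <img src="/image.jpg" alt="Savannah&#39;s streets">
        Console.WriteLine(ImgMarkup.Build("/image.jpg", "Savannah's streets"));
    }
}
```

With the apostrophe encoded as `&#39;`, the value no longer contains the character that delimits it, so the attribute should survive a parse and re-serialize round trip as a single `alt` value. Simply delimiting the value with double quotes, as in alt="Savannah's streets", is also valid HTML.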
Hi,
The input HTML is correct, but after we load it with the Load method and read the HTML back, an attribute like this comes out broken.
e.g.
case 1
string html = "
correct output
-- works fine since the attribute does not contain an apostrophe within single quotes
case 2
string html = "
malformed output
<img src="/image.jpg" alt="Savannah" s="" streets'=""> -- the HTML output from HtmlAgilityPack breaks because it does not parse an apostrophe within single quotes, so you get the output above.
'Savannah's streets' -- the apostrophe in 's breaks the HTML. It is easy to reproduce.
Thanks
Rahul Mathew
In reply to Jonathan Magnan's message of Thu, Jun 20, 2019 at 8:54 PM.
Hi,
The input HTML is correct, but after we load it with the Load method and read the HTML back, an attribute like this comes out broken.
e.g.
case 1

correct output
-- works fine since the attribute does not contain an apostrophe within single quotes
case 2
<img src="/image.jpg" alt='Savannah's streets' />
malformed output
<img src="/image.jpg" alt="Savannah" s="" streets'=""> -- the HTML output from HtmlAgilityPack breaks because it does not parse an apostrophe within single quotes, so you get the output above.
'Savannah's streets' -- the apostrophe in 's breaks the HTML. It is easy to reproduce.
Thanks
In reply to Rahul Mathew's message of Thu, Jun 20, 2019 at 9:29 PM.
Hi,
I am attaching a screenshot of the bug.
Thanks
Rahul Mathew
In reply to Rahul Mathew's message of Thu, Jun 20, 2019 at 9:31 PM.
I believe you did not attach your screenshot correctly ;)
Hi,
Please check your email inbox; it has the screenshot in case you couldn't find it on GitHub.
Thanks
Rahul Mathew
In reply to Jonathan Magnan's message of Fri, Jun 21, 2019 at 6:16 AM.