htmlparser2
Parser incorrectly recognizes < (less than) as a starting tag
The parser doesn't check whether the text after < is a valid HTML tag name. It should only treat < as the start of a tag when a valid HTML tag follows, and only then discard everything after it if no closing tag is found.
Taking an example from: https://github.com/apostrophecms/sanitize-html/issues/339
If you can find this for <$40, it's a steal! I would highly recommend getting it
After this text is run through sanitize-html, which uses htmlparser2, the string is truncated to the text before the 'lt' symbol, so the remainder of the text is discarded. Is there a setting I am missing or is this a bug?
Input:
If you can find this for <$40, it's a steal! I would highly recommend getting it
Result:
If you can find this for
Expected:
If you can find this for <$40, it's a steal! I would highly recommend getting it
You must either be using an old version of htmlparser2, or have xmlMode enabled. Current versions of the module will skip over <$.
@fb55 That one works in the latest version, but I have another use case: for internal DB functions involving a time dimension, the parser incorrectly recognizes a tag (this happens for any word after <).
Example :
event_time<current_time()
This gets trimmed down to :
event_time
Any idea what a workaround for that could be?
@fb55 any update on the above?
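One possible workaround for the event_time<current_time() case, as a minimal sketch rather than anything htmlparser2 provides: pre-process the string before handing it to the parser, and escape every < that is not followed by a tag name you expect. The function name escapeStrayLessThan and the KNOWN_TAGS allow-list are hypothetical and would need to be adapted to the tags your input can legitimately contain.

```javascript
// Hypothetical allow-list of tag names the input may legitimately contain.
// Not part of htmlparser2; extend or replace it for your own use case.
const KNOWN_TAGS = new Set([
  "a", "b", "br", "div", "em", "i", "li", "ol", "p", "span", "strong", "ul",
]);

// Replace every "<" that does not open or close a tag from KNOWN_TAGS
// with "&lt;", so that a downstream HTML parser sees it as plain text.
function escapeStrayLessThan(input) {
  return input.replace(/</g, (match, offset) => {
    const rest = input.slice(offset + 1);
    // Optional "/", then a tag name, then whitespace, "/" or ">".
    const m = /^\/?([A-Za-z][A-Za-z0-9]*)(?=[\s/>])/.exec(rest);
    const isKnownTag = m !== null && KNOWN_TAGS.has(m[1].toLowerCase());
    return isKnownTag ? match : "&lt;";
  });
}

console.log(escapeStrayLessThan("event_time<current_time()"));
console.log(escapeStrayLessThan("<b>cheap</b> at <$40"));
```

This is only a heuristic: it also escapes comments, doctype declarations, and any tag missing from the allow-list, and text such as "x <b y" would still be treated as a tag. The escaped string can then be passed to sanitize-html or htmlparser2 as usual; the &lt; entity should come through as a literal < in the text, assuming entity decoding is left enabled.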