htmlstream-rust

HTMLTagAttribute does not parse correctly when it reaches a newline

Open MutantOctopus opened this issue 8 years ago • 0 comments

While using this crate I have encountered the following problem:

I have an <a href> tag that is split across two lines. When it is parsed into an HTMLTag, its {:?} output is as follows:

HTMLTag { name: "a", html: "<a href=\n        \"http://example.domain\">", attributes: "href=\n        \"http://example.domain\"", state: Opening }

As you can see, there is a newline and a run of whitespace between the href= and the URL in question.

When you run attr_iter over this tag's html field, the HTMLTagAttribute produced for the href attribute is as follows:

HTMLTagAttribute { name: "href", value: "" }

The value is an empty string when it should be http://example.domain.

Although uncommon, this is valid HTML: the HTML syntax allows whitespace, including newlines, on either side of the = in an attribute. The crate should therefore support it.
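
To illustrate the behaviour I would expect, here is a minimal sketch of whitespace-tolerant attribute parsing. It is not the crate's code and I have not checked how htmlstream parses attributes internally; parse_first_attr and skip_ws are hypothetical names used only for this example, and the sketch handles just the first name=value pair.

```rust
/// Hypothetical helper (not part of htmlstream): drops leading ASCII
/// whitespace, which includes spaces, tabs and newlines.
fn skip_ws(s: &str) -> &str {
    s.trim_start_matches(|c: char| c.is_ascii_whitespace())
}

/// Hypothetical helper (not part of htmlstream): parses the first
/// `name=value` pair of an attribute string, allowing whitespace on
/// either side of `=`. Returns None if there is no `=` after the name.
fn parse_first_attr(input: &str) -> Option<(String, String)> {
    let s = skip_ws(input);

    // The name ends at the first whitespace character or at `=`.
    let name_end = s.find(|c: char| c.is_ascii_whitespace() || c == '=')?;
    let name = &s[..name_end];
    if name.is_empty() {
        return None;
    }

    // Skip whitespace before `=`, consume `=`, then skip whitespace
    // (including newlines) before the value.
    let rest = skip_ws(&s[name_end..]).strip_prefix('=')?;
    let rest = skip_ws(rest);

    let value = match rest.chars().next() {
        // Quoted value: read up to the matching closing quote.
        Some(q) if q == '"' || q == '\'' => {
            let body = &rest[1..];
            let end = body.find(q)?;
            &body[..end]
        }
        // Unquoted value: read up to the next whitespace or end of input.
        _ => {
            let end = rest
                .find(|c: char| c.is_ascii_whitespace())
                .unwrap_or(rest.len());
            &rest[..end]
        }
    };

    Some((name.to_string(), value.to_string()))
}
```

Called on the attributes string from the tag above, parse_first_attr("href=\n        \"http://example.domain\"") gives the name href with the value http://example.domain rather than an empty string.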

I may take a look and see if I can make a fork & PR tomorrow that will solve this problem.

MutantOctopus, Mar 04 '18 08:03