
Content from tags inside pre is missing whitespace characters

Open: johannesegger opened this issue 7 years ago • 1 comment

When emitting a parsed document, whitespace characters are usually preserved for pre tags. The following test checks this:

https://github.com/fsharp/FSharp.Data/blob/33e6e825bc2978eb9ce6dd880f31d9e60d452699/tests/FSharp.Data.Tests/HtmlParser.fs#L763-L772

However, that's not the case when there are child tags within the pre tag. As soon as the parser encounters a different tag, whitespace characters are no longer preserved. For example, when emitting the parsed document from the following snippet, the whitespace after the span tag is missing, while the whitespace before the span tag is preserved.

<pre>\r\n        This <span>code</span> should be indented and\r\n        have line feeds in it</pre>

The code that's responsible for "normalizing" whitespace characters is the following:

https://github.com/fsharp/FSharp.Data/blob/33e6e825bc2978eb9ce6dd880f31d9e60d452699/src/Html/HtmlParser.fs#L373-L382

And x.InsertionMode is calculated as follows:

https://github.com/fsharp/FSharp.Data/blob/49a3bfb22a8955463d7536af1d2df86449e335c6/src/Html/HtmlParser.fs#L353-L356

x.IsFormattedTag is only true if the last parsed tag is pre or code. Shouldn't it instead check whether the parser is currently inside a formatted tag?
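To illustrate the idea of checking "currently inside a formatted tag", here is a minimal, self-contained sketch. It is not the FSharp.Data parser: the Token type and the isFormatted, collapse, and emitDepthAware functions are hypothetical names made up for this example. It assumes whitespace should be kept as long as at least one pre or code element is still open, which it tracks with a depth counter instead of looking at the last parsed tag.

```fsharp
open System.Text.RegularExpressions

// Hypothetical token model for illustration only; these are not the
// types used inside FSharp.Data's HtmlParser.
type Token =
    | Open of string
    | Close of string
    | Text of string

// Tags whose content should keep its whitespace (per the issue: pre and code).
let isFormatted (name: string) = name = "pre" || name = "code"

// Collapse every run of whitespace into a single space.
let collapse (s: string) = Regex.Replace(s, @"\s+", " ")

// Depth-based emission: whitespace is preserved while at least one
// formatted tag is still open, regardless of which tag was parsed last.
let emitDepthAware (tokens: Token list) =
    let folder (depth: int, acc: string) token =
        match token with
        | Open name ->
            ((if isFormatted name then depth + 1 else depth), acc + "<" + name + ">")
        | Close name ->
            ((if isFormatted name then max 0 (depth - 1) else depth), acc + "</" + name + ">")
        | Text s ->
            (depth, acc + (if depth > 0 then s else collapse s))
    tokens |> List.fold folder (0, "") |> snd

// The snippet from this issue, written as tokens.
let tokens =
    [ Open "pre"
      Text "\r\n        This "
      Open "span"
      Text "code"
      Close "span"
      Text " should be indented and\r\n        have line feeds in it"
      Close "pre" ]

let result = emitDepthAware tokens
printfn "%s" result
```

With this approach the text after the closing span tag keeps its line feed and indentation, because the enclosing pre is still open when that text is emitted. Whether a counter or a stack of open elements fits the actual parser state better is a separate design question.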

johannesegger avatar Oct 25 '18 18:10 johannesegger

Reporting a related behavior, with minimal steps to reproduce:

open FSharp.Data

[<EntryPoint>]
let main argv =
    let n = List.exactlyOne (HtmlNode.Parse("""<pre>
%module graphics
%{
#include &lt;GL/gl.h&gt;
#include &lt;GL/glu.h&gt;
%}
    
// Put the rest of the declarations here
...
</pre>"""))
    printfn "%s" (n.InnerText())
    0 // an [<EntryPoint>] function must return an int exit code

This produces:


%module graphics
%{
#include <GL/gl.h> #include <GL/glu.h> %} // Put the rest of the declarations here ...

instead of the expected output:


%module graphics
%{
#include <GL/gl.h>
#include <GL/glu.h>
%}
    
// Put the rest of the declarations here
...

hcoona avatar Sep 11 '19 12:09 hcoona