Mishandling of spaces in HtmlNode.ToString?
I have some HTML that Firefox renders like this:
If I "round-trip" this in FSharp.Data, then the output renders like this:
Here is the HTML:
<pre class="shiki vitesse-light" style="background-color:#ffffff;color:#393a34" tabindex="0"><code><span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> input</span><span style="color:#1E754F"> =</span><span style="color:#B5695999"> "</span><span style="color:#B56959">123</span><span style="color:#B5695999">"</span></span>
<span class="line"></span>
<span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> intOfDigit </span><span style="color:#1E754F">(</span><span style="color:#B07D48">x </span><span style="color:#1E754F">:</span><span style="color:#2E8F82"> char</span><span style="color:#1E754F">)</span><span style="color:#1E754F"> =</span></span>
<span class="line"><span style="color:#393A34"> int x </span><span style="color:#1E754F">-</span><span style="color:#393A34"> int </span><span style="color:#B56959">'0'</span></span>
<span class="line"></span>
<span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> number</span><span style="color:#1E754F"> =</span></span>
<span class="line"><span style="color:#393A34"> input</span></span>
<span class="line"><span style="color:#1E754F"> |></span><span style="color:#393A34"> Seq.fold</span></span>
<span class="line"><span style="color:#1E754F"> (fun</span><span style="color:#B07D48"> state next </span><span style="color:#1E754F">-></span><span style="color:#393A34"> state </span><span style="color:#1E754F">*</span><span style="color:#2F798A"> 10</span><span style="color:#1E754F"> +</span><span style="color:#393A34"> intOfDigit next</span><span style="color:#1E754F">)</span></span>
<span class="line"><span style="color:#2F798A"> 0</span></span>
<span class="line"></span>
<span class="line"><span style="color:#393A34">printfn $</span><span style="color:#B5695999">"</span><span style="color:#1E754F">%i</span><span style="color:#B56959">{number}</span><span style="color:#B5695999">"</span><span style="color:#A0ADA0"> // 123</span></span>
<span class="line"></span></code></pre>
Here is a repro script:
#r "nuget: FSharp.Data, 6.4.0"
open FSharp.Data
let inputHtml = """<pre class="shiki vitesse-light" style="background-color:#ffffff;color:#393a34" tabindex="0"><code><span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> input</span><span style="color:#1E754F"> =</span><span style="color:#B5695999"> "</span><span style="color:#B56959">123</span><span style="color:#B5695999">"</span></span>
<span class="line"></span>
<span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> intOfDigit </span><span style="color:#1E754F">(</span><span style="color:#B07D48">x </span><span style="color:#1E754F">:</span><span style="color:#2E8F82"> char</span><span style="color:#1E754F">)</span><span style="color:#1E754F"> =</span></span>
<span class="line"><span style="color:#393A34"> int x </span><span style="color:#1E754F">-</span><span style="color:#393A34"> int </span><span style="color:#B56959">'0'</span></span>
<span class="line"></span>
<span class="line"><span style="color:#1E754F">let</span><span style="color:#B07D48"> number</span><span style="color:#1E754F"> =</span></span>
<span class="line"><span style="color:#393A34"> input</span></span>
<span class="line"><span style="color:#1E754F"> |></span><span style="color:#393A34"> Seq.fold</span></span>
<span class="line"><span style="color:#1E754F"> (fun</span><span style="color:#B07D48"> state next </span><span style="color:#1E754F">-></span><span style="color:#393A34"> state </span><span style="color:#1E754F">*</span><span style="color:#2F798A"> 10</span><span style="color:#1E754F"> +</span><span style="color:#393A34"> intOfDigit next</span><span style="color:#1E754F">)</span></span>
<span class="line"><span style="color:#2F798A"> 0</span></span>
<span class="line"></span>
<span class="line"><span style="color:#393A34">printfn $</span><span style="color:#B5695999">"</span><span style="color:#1E754F">%i</span><span style="color:#B56959">{number}</span><span style="color:#B5695999">"</span><span style="color:#A0ADA0"> // 123</span></span>
<span class="line"></span></code></pre>
"""
let node = HtmlNode.Parse(inputHtml) |> List.exactlyOne
let outputHtml = node.ToString()
printfn "%s" outputHtml
Maybe I have missed something?
Nope, seems like a bug to me. Might be worth adding a test case and seeing how this function fares: https://github.com/fsprojects/FSharp.Data/blob/main/src/FSharp.Data.Html.Core/HtmlNode.fs#L115-L174
I have created a test case and made a potential fix here: https://github.com/fsprojects/FSharp.Data/pull/1510
However, I'm not sure if the logic is correct for all cases - are pre tags special in HTML?
Yeah, pre elements are meant to preserve the whitespace and line breaks inside them exactly as written, whereas elsewhere in HTML runs of whitespace normally collapse when rendered. In this case we're not respecting that.
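To illustrate the idea, here is a minimal, self-contained sketch of a serializer that stops adding newlines and indentation once it is inside a pre element. The `Node` type and the `serialize` function are hypothetical names made up for this example. This is not FSharp.Data's `HtmlNode` or the code in the linked PR, and it ignores attributes and escaping.

```fsharp
open System.Text

// Hypothetical, minimal HTML tree for illustration only;
// this is NOT FSharp.Data's HtmlNode type.
type Node =
    | Text of string
    | Element of name: string * children: Node list

/// Serialize a node to a string.
/// Outside <pre>, every child is placed on its own indented line
/// (the kind of pretty-printing that changes whitespace).
/// Inside <pre> (at any depth), children are written verbatim.
/// Attributes and text escaping are omitted to keep the sketch short.
let serialize (node: Node) : string =
    let sb = StringBuilder()

    let rec go (indent: int) (inPre: bool) (n: Node) =
        match n with
        | Text t -> sb.Append(t) |> ignore
        | Element (name, children) ->
            // Once we enter <pre>, every descendant keeps its exact whitespace.
            let preserve = inPre || name = "pre"
            sb.Append("<").Append(name).Append(">") |> ignore

            for child in children do
                if not preserve then
                    sb.Append('\n').Append(String.replicate (indent + 2) " ") |> ignore
                go (indent + 2) preserve child

            if not preserve && not (List.isEmpty children) then
                sb.Append('\n').Append(String.replicate indent " ") |> ignore

            sb.Append("</").Append(name).Append(">") |> ignore

    go 0 false node
    sb.ToString()

// The same two spans, once under <pre><code> and once under <div>.
let spans = [ Element ("span", [ Text "let" ]); Element ("span", [ Text " input" ]) ]
let preNode = Element ("pre", [ Element ("code", spans) ])
let divNode = Element ("div", spans)

printfn "%s" (serialize preNode)
printfn "%s" (serialize divNode)
```

In this sketch the pre tree comes out on a single line with the leading space of " input" intact, while the div tree is spread over several indented lines. Whether a real fix should also treat other elements (for example textarea) the same way is a separate question that this sketch does not answer.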