kotlinx.html

head tag generated with forced meta tag

Open Shengaero opened this issue 7 years ago • 5 comments

Hello,

I was using the JVM version of this library to generate an HTML file with a head tag when I noticed that it generates a forced meta tag.

More specifically:

html {
    head {
        // some code
    }
}

Generates to this:

<html>
  <head>
    <META type="Content-Type" content="text/html; UTF-8"/>
    <!-- Some code -->
  </head>
</html>

I've looked in various places to see if there's a way to prevent this from happening, but have had no luck.

Shengaero, Dec 08 '17 16:12

How do you know that it is generated? Most likely it is appended by the browser.

cy6erGn0m, Dec 15 '17 17:12

I never opened this in a browser. This is the direct output to a file, via a JVM program.

Shengaero, Dec 15 '17 18:12

To clarify my issue: when I use the word "generated", I mean kotlinx.html creating an HTML string and writing it to a file, i.e. "code generation".

My original issue was written at a bad time, when I didn't have access to my computer, so the output in it was paraphrased.

The exact output that is "generated" looks like this:

<!DOCTYPE html>
<html>
  <head>
    <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <!-- Other Head Tags -->
  </head>
  <!-- Other HTML Tags -->
</html>

From my debugging I found that it is always generated at the top of the head tag, even if I leave the head block empty in Kotlin:

fun example(html: HTML) = html.apply {
    head {
        // No code here
    }
}

This produces the following text when the HTML is serialized to a string and written to an HTML file:

<!DOCTYPE html>
<html>
  <head>
    <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
</html>

As some final context, to make sure that there isn't anything possibly missing, the code that I use to write the html to a file is below:

val document: Document = // ...
val outputFile: File = // ...

outputFile.writer(Charsets.UTF_8).use { it.write(document, prettyPrint = true) }

Again, I have looked for a way to prevent this, as I thought it might be my own fault. My current fix is to post-process the HTML string before writing it to a file, replacing the exact character sequence that is generated (<META http-equiv="Content-Type" content="text/html; charset=UTF-8">) with an empty string.
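
For illustration, the post-processing workaround described above might look roughly like the following. This is a minimal sketch, not the code from the project in question; the names `FORCED_META` and `stripForcedMeta` are made up here, and the match is an exact string match, so it would miss any variation in casing, attribute order, or charset.

```kotlin
// Hypothetical constant: the exact tag text reported in this issue.
const val FORCED_META =
    "<META http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">"

// Hypothetical helper: removes every exact occurrence of the tag.
// Surrounding whitespace and line breaks are left untouched, so pretty-printed
// output keeps a blank line where the tag used to be.
fun stripForcedMeta(html: String): String = html.replace(FORCED_META, "")
```

The serialized string would be passed through `stripForcedMeta` before being written to the output file.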

The code in which I first encountered this behavior can be found here, although I doubt it helps or matters, since the same thing has happened in completely separate testing I've done.

Shengaero, Dec 15 '17 23:12

I'm pretty sure it's not kotlinx.html; maybe it's the Document writing that adds something? E.g. a similar issue is reported for Jekyll (https://github.com/jekyll/jemoji/issues/41)

orangy, Jan 07 '18 14:01

The meta tag is certainly generated by the XML Transformer (javax.xml.transform.Transformer).

It's easy to see from the DOM rendering code itself, which looks like the following (excerpt from kotlinx.html):

fun Writer.write(element: Element, prettyPrint : Boolean = true) : Writer {
    val transformer = TransformerFactory.newInstance().newTransformer()
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes")
    transformer.setOutputProperty(OutputKeys.METHOD, "html")
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8")

    if (prettyPrint) {
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2")
        transformer.setOutputProperty(OutputKeys.INDENT, "yes")
    }

    transformer.transform(DOMSource(element), StreamResult(this))
    return this
}

Do you have any dependencies that could override the transformer factory or inject a custom transformer via SPI? What is the class of the instance returned by TransformerFactory.newInstance().newTransformer()? For me it is com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl by default.
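
One way to answer that question is to print the class names at runtime. The following is a minimal sketch that uses only the standard JAXP API; the function name `transformerImplementationNames` is hypothetical.

```kotlin
import javax.xml.transform.TransformerFactory

// Hypothetical helper: returns the concrete factory and transformer class names
// that JAXP resolves on the current classpath.
fun transformerImplementationNames(): Pair<String, String> {
    val factory = TransformerFactory.newInstance()
    val transformer = factory.newTransformer()
    return factory.javaClass.name to transformer.javaClass.name
}
```

Calling `println(transformerImplementationNames())` from the same program that writes the file shows which implementation is in use. A class name outside the JDK's own packages would suggest that a dependency supplies its own transformer.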

You could also try rewriting the function above and configuring the transformer yourself. Perhaps try removing transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8").
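
As a starting point for such experiments, here is a sketch that takes kotlinx.html out of the picture entirely. It builds an empty html/head document with the plain JAXP DOM API and serializes it with the same kind of transformer setup, with the output method as a parameter. The helper names `emptyHeadDocument` and `serialize` are made up for this example. For background, the XSLT 1.0 specification says the html output method should add a META element right after the HEAD start tag, so the "xml" method is one setting under which the tag should not appear. The "xml" method also changes other serialization details (for example, empty elements are written as self-closing tags), so it may not be an acceptable replacement.

```kotlin
import java.io.StringWriter
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.transform.OutputKeys
import javax.xml.transform.TransformerFactory
import javax.xml.transform.dom.DOMSource
import javax.xml.transform.stream.StreamResult
import org.w3c.dom.Document

// Hypothetical helper: builds <html><head/></html> without kotlinx.html.
fun emptyHeadDocument(): Document {
    val document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument()
    val html = document.createElement("html")
    html.appendChild(document.createElement("head"))
    document.appendChild(html)
    return document
}

// Hypothetical helper: serializes the document with the given output method
// ("html" or "xml"), otherwise configured like the excerpt above.
fun serialize(document: Document, method: String): String {
    val transformer = TransformerFactory.newInstance().newTransformer()
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes")
    transformer.setOutputProperty(OutputKeys.METHOD, method)
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8")
    val out = StringWriter()
    transformer.transform(DOMSource(document), StreamResult(out))
    return out.toString()
}
```

Comparing `serialize(emptyHeadDocument(), "html")` with `serialize(emptyHeadDocument(), "xml")` should show whether the tag comes from the transformer's html output method rather than from the DSL.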

cy6erGn0m, Jan 09 '18 08:01