jsoup
Allow configuration of self closing for tags
When using web components / custom elements, the tag will always be marked as selfClosing=false, and I can see no way of changing this.
Not developing. Just for help.
I believe it's mentioned here: http://jsoup.org/news/release-1.8.1
"Introduced the ability to chose between HTML and XML output, and made HTML the default. This means img tags are output as <img>, not <img />. XML is the default when using the XmlTreeBuilder. Control this with the Document.OutputSettings.syntax() method."
The parser already marks unknown (i.e. custom) tags as selfClosing if they are encountered as such in the input, but there is no such support for tags created programmatically using Tag.valueOf. This means that, as a workaround, you could use something like Jsoup.parse("<my-tag />").body().child(0).tag() instead of Tag.valueOf("my-tag"), although that solution is of course quite horrible from a performance point of view.
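The workaround above can be sketched roughly as follows. The names `ParsedTagDemo` and `tagFromMarkup` are made up for illustration; only `Jsoup.parse`, `Element.tag()`, `Tag.valueOf`, `Tag.getName()` and `Tag.isSelfClosing()` are jsoup calls. Whether the parsed tag actually reports itself as self-closing may depend on the jsoup version, so the sketch prints the flag instead of assuming a value.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Tag;

public class ParsedTagDemo {
    /** Hypothetical helper: parses "<name />" as HTML and returns the Tag the parser created for it. */
    public static Tag tagFromMarkup(String name) {
        Element parsed = Jsoup.parse("<" + name + " />").body().child(0);
        return parsed.tag();
    }

    public static void main(String[] args) {
        Tag created = Tag.valueOf("my-tag");      // created programmatically
        Tag parsed = tagFromMarkup("my-tag");     // created by the parser
        System.out.println("valueOf: " + created.isSelfClosing());
        System.out.println("parsed:  " + parsed.isSelfClosing());
    }
}
```

Every call to the helper runs a full HTML parse, which is the performance cost mentioned above.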
I'm not sure whether it would make sense to just add an overload of valueOf that also takes a boolean for controlling the selfClosing field, or if it would be wiser to introduce a TagBuilder that would allow customising all aspects of the configurations affecting how the tag is output.
On the other hand, this might not make any sense at all since "html5" has no concept of self-closing tags as explained in e.g. http://stackoverflow.com/a/3558200 other than for "foreign" elements, i.e. MathML and SVG elements. The Custom Elements specification doesn't define such elements as "foreign" nor does it define any new syntax that would support self-closing tags.
Jsoup supports XML parsing and generation, not just HTML. Support for non-HTML5 self-closing tags is mandatory for XML generation. It would be much nicer to be able to either pass a flag to Tag#valueOf or modify the self-closing attribute. I use the following utility method for creating self-closing elements in XML:
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.nodes.Entities;
    import org.jsoup.parser.Parser;

    public static Element createVoidElement(String tagName, String baseUri) {
        Document document = Jsoup.parse("<" + tagName + "/>", baseUri, Parser.xmlParser());
        document.outputSettings().prettyPrint(false);
        document.outputSettings().escapeMode(Entities.EscapeMode.xhtml);
        return document.child(0); // the XML parser adds no <html>/<body> wrapper
    }
As described above, the parsed Element will have the self-closing (internal) flag set to true and will render as expected.
The problem with allowing the self-closing property to be changed is that a developer could set it to true on an element that has children, which would produce incorrect XML. I think the model is fine; just add some static methods that let us register our own tags. Nobody has dynamic tags: we all have specific tags for our projects, which can be defined at development time. If the needs grow, the tags grow or change, and that's it.
Is there a way I can create a tag like <my-tag/> using jsoup?
For anyone who needs to convert self-closing tags to non-self-closing tags, there is a tricky solution using PowerMock:
    Document document = Jsoup.parse(...);
    document.traverse(new NodeVisitor() {
        @Override
        public void head(Node node, int depth) {
            if (node instanceof Element) {
                Element el = (Element) node;
                Whitebox.setInternalState(el.tag(), "selfClosing", false);
            }
        }

        @Override
        public void tail(Node node, int depth) {
        }
    });
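The same idea might work without PowerMock by using plain `java.lang.reflect`. This is only a sketch: the class and method names (`SelfClosingFlag`, `trySet`) are hypothetical, and the private field name `selfClosing` is taken from the snippet above; it is an internal detail that may not exist in every jsoup version, so the helper reports failure instead of throwing.

```java
import java.lang.reflect.Field;
import org.jsoup.parser.Tag;

public class SelfClosingFlag {
    /**
     * Tries to overwrite Tag's private "selfClosing" field (assumed name, see above).
     * Returns false if the field is missing or cannot be accessed.
     */
    public static boolean trySet(Tag tag, boolean value) {
        try {
            Field field = Tag.class.getDeclaredField("selfClosing");
            field.setAccessible(true);
            field.setBoolean(tag, value);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // No such field, or reflective access was refused.
            return false;
        }
    }
}
```

As far as I know, jsoup hands out shared Tag instances for known tags, so changing the flag on one of them may affect every element that uses that tag, not just the one being visited.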
To transform
<a>
<b>
</a>
into
<a>
<b />
</a>
you should enable xml mode to generate well-formed html:
document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);
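A minimal sketch of the difference between the two output modes, assuming a document parsed with the default HTML parser. `SyntaxDemo` and `render` are illustrative names only. The example uses `<br>`, a void element, because in HTML syntax only such elements are written without a closing slash.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SyntaxDemo {
    /** Parses the HTML and serializes its body using the requested output syntax. */
    public static String render(String html, Document.OutputSettings.Syntax syntax) {
        Document document = Jsoup.parse(html);
        // Disable pretty printing so the only difference in the output is the syntax.
        document.outputSettings().syntax(syntax).prettyPrint(false);
        return document.body().html();
    }

    public static void main(String[] args) {
        String html = "<p>one<br>two</p>";
        System.out.println(render(html, Document.OutputSettings.Syntax.html));
        System.out.println(render(html, Document.OutputSettings.Syntax.xml));
    }
}
```

In HTML mode the line break should come out as `<br>`, and in XML mode it should be closed with a slash; the exact spacing before the slash may vary between jsoup versions.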