TwelveMonkeys
TwelveMonkeys copied to clipboard
Use Document in SVGImageReader's setInput
(I thought it was better to create a new topic for this, see #442 for background)
The change you proposed to let setInput
receive a Document
object does in fact work (with ~~a few limitations that I will enumerate later~~ one limitation - more on that later).
I do think it would be a good thing to implement in TwelveMonkeys, since it is quite easy and makes the reader more friendly. So the changes that need to be done in TwelveMonkeys are:
- In
SVGImageReader.java
exactly as you said:
@Override
public void setInput(Object pInput, boolean seekForwardOnly, boolean ignoreMetadata) {
super.setInput(pInput, seekForwardOnly, ignoreMetadata);
if (imageInput != null) {
TranscoderInput input = new TranscoderInput(IIOUtil.createStreamAdapter(imageInput));
rasterizer.setInput(input);
} else if (pInput instanceof Document) {
Document document = (Document) pInput;
TranscoderInput input = new TranscoderInput(document);
input.setURI(document.getBaseURI());
rasterizer.setInput(input);
}
}
- In
SVGImageReaderSpi.java
, change thecanDecode
to something like (a more specific type of Document may be needed, later on limitations):
public boolean canDecodeInput(final Object pSource) throws IOException {
return (pSource instanceof Document) || (pSource instanceof ImageInputStream && canDecode((ImageInputStream) pSource));
}
- And finally, in
SVGProviderInfo.java
, this must be added (a more specific type of Document may be needed, later on limitations):
private final Class[] inputTypes = new Class[] {ImageInputStream.class, Document.class};
@Override
public Class[] inputTypes() {
return inputTypes;
}
And that's it! Works (almost) perfectly. You may want to add some test cases to check the new feature. I'll share the code on how to create a Document object from a file next, but first the ~~limitations~~ limitation:
~~1) The CSS problem still exists when using a Document as input. CSS will not be used if the xml-stylesheet tag is before the SVG tag.~~
- Using the standard org.w3c.dom.Document does not work. It gives ClassCastExceptions, here is an example:
java.lang.ClassCastException: org.apache.batik.dom.GenericElement cannot be cast to org.w3c.dom.svg.SVGSVGElement
at org.apache.batik.anim.dom.SVGOMDocument.getRootElement(SVGOMDocument.java:234)
at org.apache.batik.bridge.BridgeContext.getBridgeExtensions(BridgeContext.java:2054)
at org.apache.batik.bridge.BridgeContext.registerSVGBridges(BridgeContext.java:2044)
at org.apache.batik.bridge.BridgeContext.setDocument(BridgeContext.java:445)
at org.apache.batik.bridge.GVTBuilder.build(GVTBuilder.java:54)
at com.twelvemonkeys.imageio.plugins.svg.SVGImageReader$Rasterizer.transcode(SVGImageReader.java:322)
at org.apache.batik.transcoder.XMLAbstractTranscoder.transcode(XMLAbstractTranscoder.java:142)
at org.apache.batik.transcoder.SVGAbstractTranscoder.transcode(SVGAbstractTranscoder.java:156)
at com.twelvemonkeys.imageio.plugins.svg.SVGImageReader$Rasterizer.init(SVGImageReader.java:561)
at com.twelvemonkeys.imageio.plugins.svg.SVGImageReader$Rasterizer.getDefaultWidth(SVGImageReader.java:574)
at com.twelvemonkeys.imageio.plugins.svg.SVGImageReader.getWidth(SVGImageReader.java:240)
at com.twelvemonkeys.imageio.plugins.svg.SVGImageReader.paramsToHints(SVGImageReader.java:180)
at com.twelvemonkeys.imageio.plugins.svg.SVGImageReader.read(SVGImageReader.java:129)
...
There is another person with the same problem: https://stackoverflow.com/questions/44029119/using-vanilla-w3c-document-and-apache-batik-svg-rasterizer
So, instead of creating a Document like this:
try {
ImageReader reader = ...
File svgFile = ...
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Probably good idea to set some configs, specially to disable loading DTD that causes a 15s wait.
dbf.setNamespaceAware(true);
dbf.setValidating(false);
dbf.setExpandEntityReferences(false);
dbf.setCoalescing(true);
dbf.setIgnoringComments(false);
dbf.setFeature("http://xml.org/sax/features/namespaces", false);
dbf.setFeature("http://xml.org/sax/features/validation", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
Document doc = dbf.newDocumentBuilder().parse(svgFile);
// ? doc.setDocumentURI(svgFile.getParentFile().toURI().toString());
reader.setInput(doc);
...
} catch (IOException ex) {
...
}
one mut use the Batik way:
try {
ImageReader reader = ...
File svgFile = ...
String parser = XMLResourceDescriptor.getXMLParserClassName();
SAXSVGDocumentFactory df = new SAXSVGDocumentFactory(parser);
Document doc = df.createDocument(svgFile.toURI().toString());
// ? doc.setDocumentURI(svgFile.getParentFile().toURI().toString());
reader.setInput(doc);
...
} catch (IOException ex) {
...
}
For this reason, it may be a good idea to force Document input to be of type SVGOMDocument (or maybe the more generic SVGDocument).
While it is very saddening that one cannot use a plain org.w3c.dom.Document (thus making the code plugin-aware / Batik-libraries-aware) ~~and that it does not fix the CSS problem that I was already experiencing when converting the Document to a byte array stream~~, I think it is a nice feature to add in TweleveMonkeys as it wouldn't hurt anyone.
The CSS problem is my fault after all, argh! When I deep clone the Document, somehow the CSS line gets lost. So there is no issue at all with the CSS, you can forget that.
Excellent!
If we can only reliably support SVGDocument
or SVGOMDocument
, we should change the supported input type to that. The plugin does depend on Batik anyway, so I don't think that is much of a problem. We do need to make sure that the Spi does not crash, but silently de-register itself if Batik isn't available, though.
Can you add a test case too, and perhaps create a pull request for this? I think this would be a useful contribution. :-)
Best regards,
-- Harald K
Well, a work around to also support the vanilla Document would be to get an ImageInputStream over the in-memory object. Like this:
/**
* Gets an ImageInputStream to the SVG XML Document.
*
* @param svgXML - Document to which a stream will be returned.
* @return An ImageInputStream to the SVG XML Document.
* @throws java.io.IOException
* @throws javax.xml.transform.TransformerConfigurationException
* @throws javax.xml.transform.TransformerException
*/
public static ImageInputStream getDocumentInputStream(Document svgXML) throws IOException, TransformerConfigurationException, TransformerException {
try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
Result outputTarget = new StreamResult(outputStream);
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty(OutputKeys.VERSION, svgXML.getXmlVersion());
transformer.setOutputProperty(OutputKeys.STANDALONE, svgXML.getXmlStandalone() ? "yes" : "no");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
//transformer.setOutputProperty(OutputKeys.INDENT, "yes");
//transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
//transformer.setOutputProperty("{http://xml.apache.org/xalan}indent-amount", "2");
transformer.transform(new DOMSource(svgXML), outputTarget);
//System.out.println(new String(outputStream.toByteArray(), transformer.getOutputProperty(OutputKeys.ENCODING)));
return ImageIO.createImageInputStream(new ByteArrayInputStream(outputStream.toByteArray()));
}
}
This is how I'm doing it in my image viewer, since I need to use a vanilla Document to do some XML tweaking.
Would this be interesting? I know it is not that elegant, I mean, converting an already parsed document to bytes to let batik re-parse it again later, but I'm using it and it is quite fast. I mean, I don't know what happens if the SVG file size is in the order of MB...
Okay, I just created the pull request including the hack for accepting vanilla Documents. You can easily remove it if you feel it is not appropriated ; )
About the test cases, I don't really understand how/where to do it. But maybe this method will be handy to you:
public Document loadXMLDocument(File file, boolean vanilla) throws IOException, ParserConfigurationException, SAXException{
Document doc;
if(vanilla){
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Probably good idea to set some configs, specially to disable loading DTD that causes a 15s wait.
dbf.setNamespaceAware(true);
dbf.setValidating(false);
dbf.setExpandEntityReferences(false);
dbf.setCoalescing(true);
dbf.setIgnoringComments(false);
dbf.setFeature("http://xml.org/sax/features/namespaces", false);
dbf.setFeature("http://xml.org/sax/features/validation", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
doc = dbf.newDocumentBuilder().parse(file);
} else {
String parser = XMLResourceDescriptor.getXMLParserClassName();
SAXSVGDocumentFactory df = new SAXSVGDocumentFactory(parser);
doc = df.createDocument(file.toURI().toString());
}
if(doc != null) {
doc.setDocumentURI(file.toURI().toString());
}
return doc;
}
If you use vanilla==true it will return a simple org.w3c.dom.Document, otherwise it will return a Batik SVGDocument.
Thanks!
Okay, I won't merge directly without a test case, but I might look into it at some point.
I any case, I don't think I like the serialize/deserialize-hack for handling plain Document
s. In my mind, it kind of defeats the purpose of the feature, which is to make it faster to render a document if you already have it parsed in memory.
If we need to serialize and re-parse from an ImageInputStream
, you could already do this without this feature, externally.
-- Harald K
Sure, I totally understand that. Feel free to remove the hack, I had it already implemented in my code anyway : )