During replacing Element cannot accept SVG URI with param

Open omna-manz opened this issue 3 years ago • 5 comments

In our project, some img elements in the HTML link to SVG resources. While debugging, I noticed that the method "createReplacedElement()" in the class "PdfBoxReplacedElementFactory" cannot load an SVG resource if the URI ends with query parameters. Please see the example below:

<img src="http://server.com/logo.svg?ln=images&v=20210921082157" alt="logo">

As a result, the library tries to create a PdfBoxImageElement instead of a PdfBoxSVGReplacedElement, and then it cannot read the resource either.

com.openhtmltopdf.load WARNING:: Unrecognized image format for: /logo.svg?ln=images&v=20210921082157
com.openhtmltopdf.exception WARNING:: Can't read image file; unexpected problem for URI '/logo.svg?ln=images&v=20210921082157'

Note: The URI cannot be changed.

omna-manz avatar Sep 21 '21 13:09 omna-manz

Hi @omna-manz, as you have noticed, the issue is that we don't have access to the MIME type in this specific part of the code, so we rely on a "simple" solution:

https://github.com/danfickle/openhtmltopdf/blob/ccd29f03ede2aecadac9c39fda95a5fedfb23645/openhtmltopdf-pdfbox/src/main/java/com/openhtmltopdf/pdfboxout/PdfBoxReplacedElementFactory.java#L65

We currently check whether the URL ends with .svg. A possible non-disruptive solution would be to create a URL object and ignore the query parameters. An even better, but more intrusive, solution would be to finally take into account the MIME type returned by the server (or file system).
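For illustration, here is a minimal sketch of the "ignore the query parameters" idea: parse the src value and test only its path component. `SvgUriCheck` and `hasSvgPath` are hypothetical names, not part of openhtmltopdf, and the case-insensitive comparison is an extra choice made in this sketch rather than something the library is known to do.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

/** Hypothetical helper for illustration only; not an openhtmltopdf class. */
final class SvgUriCheck {
    private SvgUriCheck() {}

    /**
     * Returns true when the path component of the given URI ends with ".svg",
     * ignoring any query string or fragment.
     */
    static boolean hasSvgPath(String src) {
        if (src == null) {
            return false;
        }
        String path;
        try {
            // getPath() excludes the "?query" and "#fragment" parts.
            path = new URI(src).getPath();
        } catch (URISyntaxException e) {
            // Unparseable value: fall back to checking the raw string.
            path = src;
        }
        if (path == null) {
            // Opaque URIs (for example "data:...") have no path component.
            return false;
        }
        // Assumption of this sketch: compare case-insensitively.
        return path.toLowerCase(Locale.ROOT).endsWith(".svg");
    }
}
```

With this check, both the absolute URL from the example and the relative form shown in the log output would be treated as SVG, while a PNG whose query string merely ends in ".svg" would not.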

syjer avatar Sep 22 '21 11:09 syjer

Hi @syjer, thank you for your response and explanation. Regarding your ideas, I think it would be great if the query parameters were ignored or the MIME type of the returned file were checked. But that isn't possible out of the box (at least for me): on the one hand, the mentioned class is used internally, and on the other hand, the data cannot be loaded without its query parameters.

omna-manz avatar Sep 22 '21 13:09 omna-manz

@omna-manz, until we fix this, you could just add .svg to the end of your SVG URLs (in your templates) and then remove the suffix in a custom URI resolver (or protocol handler). It is a bit of a hack but should work.
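A rough sketch of the string handling this workaround needs, for illustration. `SvgSuffixHack`, `mark` and `unmark` are hypothetical names, and wiring `unmark` into a custom URI resolver or protocol handler is left out because that hook is not shown in this thread. The sketch assumes the suffix is only ever added to URLs that carry a query string, so a genuine URL whose query value ends in ".svg" would be stripped by mistake.

```java
/** Hypothetical helper for illustration only; not an openhtmltopdf class. */
final class SvgSuffixHack {
    /** Suffix appended in templates so the src attribute ends with ".svg". */
    static final String MARKER = ".svg";

    private SvgSuffixHack() {}

    /** Template side: append the marker to the real URL. */
    static String mark(String uri) {
        return uri + MARKER;
    }

    /**
     * Resolver side: restore the real URL before it is fetched.
     * Only URLs that contain a query string are touched, so a plain
     * "logo.svg" is returned unchanged.
     */
    static String unmark(String uri) {
        if (uri != null && uri.indexOf('?') >= 0 && uri.endsWith(MARKER)) {
            return uri.substring(0, uri.length() - MARKER.length());
        }
        return uri;
    }
}
```

In a template, the img src would then be the marked URL, and the resolver would call `unmark` on every URI it receives.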

@syjer, I was thinking we could add a default method to the FSStream interface to get the type of a resource. This would allow the user to use whatever logic or method needed to determine content type. By default it would just return UNKNOWN to use current behaviour. What do you think?

With SVGs we also have to think of security. I think I'll make the default external resource access controller reject external SVGs. This would be a backwards-incompatible (breaking) change, but from a security perspective it is probably right, especially if we change the content type detection.

Finally, I need to audit external resource access to make sure the controller is used in all places.

danfickle avatar Sep 26 '21 06:09 danfickle

@danfickle nice idea. It will be interesting to see how it can be implemented in a way that avoids multiple fetches of the resource.

An implementation of "fsstream.getType() and then fsstream.getStream()" should avoid fetching the resource twice :)
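To make the proposal concrete, here is one way the "getType() and then getStream()" pair could share a single fetch. This is a sketch under assumptions, not the library's API: `TypedStream`, `ResourceType` and `CachingTypedStream` are hypothetical stand-ins for the proposed FSStream change, and the crude "<svg" sniffing stands in for whatever detection logic a user would supply.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

/** Hypothetical content types; UNKNOWN keeps the current behaviour. */
enum ResourceType { UNKNOWN, SVG }

/** Hypothetical stand-in for a stream interface with a default type method. */
interface TypedStream {
    InputStream getStream();

    /** Existing implementations need no change: they report UNKNOWN. */
    default ResourceType getType() {
        return ResourceType.UNKNOWN;
    }
}

/** Fetches the resource once and serves both the type and the stream from memory. */
final class CachingTypedStream implements TypedStream {
    private final Supplier<byte[]> fetcher;
    private byte[] bytes; // filled on first use

    CachingTypedStream(Supplier<byte[]> fetcher) {
        this.fetcher = fetcher;
    }

    private byte[] bytes() {
        if (bytes == null) {
            bytes = fetcher.get(); // the only place the resource is fetched
        }
        return bytes;
    }

    @Override
    public ResourceType getType() {
        byte[] b = bytes();
        // Assumption of this sketch: look for "<svg" in the first 256 bytes.
        String head = new String(b, 0, Math.min(b.length, 256), StandardCharsets.UTF_8);
        return head.contains("<svg") ? ResourceType.SVG : ResourceType.UNKNOWN;
    }

    @Override
    public InputStream getStream() {
        return new ByteArrayInputStream(bytes());
    }
}
```

Buffering the whole resource in memory is the simplest way to avoid a second fetch; an HTTP-based implementation could instead read the Content-Type header of the response it already has open.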

syjer avatar Sep 26 '21 15:09 syjer

Hi @danfickle, is there any news on whether this will be in the next release?

omna-manz avatar Dec 20 '21 10:12 omna-manz