"Unauthorized path host" exception is a breaking change and needs better configuration to workaround
When upgrading to version 5.3.2, we began experiencing a new exception "Spipu\Html2Pdf\Exception\HtmlParsingException: Unauthorized path host"
This is a result of increased security handling introduced here: https://github.com/spipu/html2pdf/commit/ff07b14d5d153c1c3b3a8fc878e0195881a2d45a#diff-eb2dc4f9754cd33a8b40b53476fb59fa7ef092b51b463f069c42e0079c1b81c1
As this is a change in default behavior, this should be considered a breaking change, but was instead included as part of a patch version update.
There are two ways to work around this change, but both have tradeoffs that are not ideal:
- Overwrite the default Security service with a custom class using $html2pdf->setSecurityService. The main downside is that we no longer benefit from whatever updates this package makes to the default Security service. Overrides should be used to extend behavior, not turn it off.
- Build out our allowlist using $html2pdf->getSecurityService()->addAllowedHost. This is unnecessarily cumbersome when the HTML is generated from a controlled source where all URLs can be trusted. Implementing this would require undue effort on our end just to maintain existing behavior, and it introduces unwanted coupling between initial HTML generation and subsequent PDF generation.
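For reference, the allowlist workaround could look roughly like the sketch below. The helper name allowTrustedHosts and the host name are hypothetical; the only library call relied on is addAllowedHost, as named above, and its exact signature is assumed to accept a host string. The parameter is left untyped so the sketch runs without the library installed.

```php
<?php
declare(strict_types=1);

/**
 * Registers each trusted host on a security service.
 *
 * $security is expected to be whatever $html2pdf->getSecurityService()
 * returns (assumption: it exposes addAllowedHost(string $host)).
 */
function allowTrustedHosts(object $security, array $hosts): void
{
    foreach ($hosts as $host) {
        $security->addAllowedHost($host);
    }
}

// Usage with the real library (assumes html2pdf >= 5.3.2 installed via Composer;
// the bucket host below is a placeholder):
//
// $html2pdf = new \Spipu\Html2Pdf\Html2Pdf('P', 'A4', 'en');
// allowTrustedHosts($html2pdf->getSecurityService(), ['example-bucket.s3.amazonaws.com']);
// $html2pdf->writeHTML($html);
```

Every host the HTML may reference has to be listed wherever the PDF is generated, which is the coupling described in the second bullet.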
Suggested solutions:
- Treat an empty array of allowedHosts as allowing all hosts.
- Add an option to DISABLE allowedHosts checking. Alternatively, to make this a non-breaking change, add an option to ENABLE allowedHosts checking and require consumers to opt in.
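To make the second suggestion concrete, here is a minimal, self-contained sketch of a host check with an on/off switch. The class HostPolicy and its method names are hypothetical and are not part of html2pdf; it only illustrates the proposed semantics. In this sketch, paths without a host (such as local files) are not subject to the host check.

```php
<?php
declare(strict_types=1);

// Hypothetical illustration of the proposed behavior, not the library's code.
final class HostPolicy
{
    /** @var string[] */
    private array $allowedHosts = [];
    private bool $checkEnabled;

    // Passing false here would model the opt-in variant (check off by default).
    public function __construct(bool $checkEnabled = true)
    {
        $this->checkEnabled = $checkEnabled;
    }

    public function addAllowedHost(string $host): self
    {
        $this->allowedHosts[] = strtolower($host);
        return $this;
    }

    public function disableCheck(): self
    {
        $this->checkEnabled = false;
        return $this;
    }

    public function isAllowed(string $url): bool
    {
        if (!$this->checkEnabled) {
            return true; // check switched off: every host passes
        }
        $host = parse_url($url, PHP_URL_HOST);
        if (!is_string($host)) {
            return true; // no host component, e.g. a local path
        }
        return in_array(strtolower($host), $this->allowedHosts, true);
    }
}
```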
Hi, this is a security patch, which is why it was released in a patch version, even though it is also a breaking change. But I will add new methods on the security service to make it possible to disable the check on the allowed hosts.
Question: why do you need to use external HTTP resources? If those files are on your server, it is better and faster to use absolute paths instead.
https://github.com/spipu/html2pdf/commit/b5242416105db0d6c87851676d08cb32ef3bd7e2
Is it OK for you?
> Question: why do you need to use external HTTP resources? If those files are on your server, it is better and faster to use absolute paths instead.
The external resource we are using is stored in AWS's S3 service. I suppose it would be possible for us to download the image locally as part of this PDF generation process, but that was not how the feature was implemented originally.
> Is it OK for you?
I haven't tested the code but that seems like a sufficient and sane implementation to address our need. Thank you for the quick response!
If the generated PDF file has a local cache, it will be fine, even if you need to generate a unique download link for each S3 file because your S3 bucket is not fully public. Making your S3 bucket public should only be considered if the files do not contain personal data.
The issue is that html2pdf does not use local caching for files: it fetches them from the network multiple times if it needs to precalculate some parts of the HTML.
It would be much better and faster to preload the resources locally first, and then use absolute paths.
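A rough sketch of that preloading step is shown below. The function name localizeRemoteImages, the cache directory parameter, and the injectable $fetch callable are all hypothetical; it only handles double-quoted src attributes on img tags and assumes the cache directory already exists and is writable.

```php
<?php
declare(strict_types=1);

/**
 * Downloads each remote image referenced in the HTML once and rewrites
 * its src attribute to the absolute local path of the downloaded copy.
 */
function localizeRemoteImages(string $html, string $cacheDir, ?callable $fetch = null): string
{
    // Default downloader; a custom callable can be injected (e.g. for tests).
    $fetch = $fetch ?? static function (string $url): string {
        $data = file_get_contents($url);
        if ($data === false) {
            throw new RuntimeException("Unable to download $url");
        }
        return $data;
    };

    $localPaths = []; // URL => local path, so each URL is fetched only once

    $result = preg_replace_callback(
        '/(<img\b[^>]*?\bsrc=")(https?:\/\/[^"]+)(")/i',
        static function (array $m) use ($cacheDir, $fetch, &$localPaths): string {
            // Signed URLs usually contain "&amp;" in HTML; decode before fetching.
            $url = html_entity_decode($m[2]);
            if (!isset($localPaths[$url])) {
                $ext = pathinfo((string) parse_url($url, PHP_URL_PATH), PATHINFO_EXTENSION);
                $path = rtrim($cacheDir, '/') . '/' . sha1($url) . ($ext !== '' ? ".$ext" : '');
                file_put_contents($path, $fetch($url));
                $localPaths[$url] = $path;
            }
            return $m[1] . $localPaths[$url] . $m[3];
        },
        $html
    );

    if ($result === null) {
        throw new RuntimeException('Failed to rewrite image sources');
    }
    return $result;
}
```

The rewritten HTML can then be passed to the PDF generator, and the cache directory removed afterwards.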
These are all good points, and I'll try to keep them in mind if we ever need to address performance issues with this PDF generation process. That being said, the current performance of our PDF generation process is sufficient, and we have many higher priorities to address, so it's not feasible for us to spend more time refactoring that process right now.
released
Mmm I have just installed it and I'm getting the same error.
Calling the new method introduced in 5.3.3 fixed the issue for me: $html2pdf->getSecurityService()->disableCheckAllowedHosts();
@spipu thanks again for the prompt response! I really do appreciate it.
You're welcome.
Please be cautious when using this new method that disables the security feature. You do so at your own risk, and I cannot be held responsible in case of a security issue or attack.