BookStack
HTML export taking longer than 1 minute
Describe the Bug
Attempting an HTML export fails after one minute with a 504 error.
Steps to Reproduce
Using either export-books.php or via UI
Attempt to generate an HTML export of a book.
Expected Behaviour
HTML would be downloaded.
Screenshots or Additional Context
The txt download is ~533kB
Log from console.
2024-06-03T18:51:46.963675560Z % Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
2024-06-03T18:51:47.105897343Z
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 2572 100 2572 0 0 18402 0 --:--:-- --:--:-- --:--:-- 18503
2024-06-03T18:53:00.705073606Z PHP Warning: file_get_contents(http://bookstack-service.wiki/api/books/28/export/html): Failed to open stream: HTTP request failed! HTTP/1.1 504 Gateway Time-out
2024-06-03T18:53:00.705112372Z in /export-books.php on line 74
Browser Details
No response
Exact BookStack Version
v24.05.1
PDF also times out
Hi @jonathon2nd, Exports can take a while if there's a lot of content, and in rare cases specific content can trip up the export system and cause more work than expected. Really, this is the kind of thing I'd need to replicate with the same content to actually test.
Do other books in the system also time-out, even if simple? You could maybe clone the book and delete parts of it to help identify if it's mainly down to a specific page or collection of pages.
Check your logs ... you may need to change memory limits or execution timeout in php.ini
@M0n7y5 Both had already been increased. I am now running into a Cloudflare timeout. No errors in the container logs.
@ssddanbrown We have no other books that have the timeout. Once the book is split up, we will export each one and see if it is a problem because of content type, not necessarily the size of the book.
The txt download is ~533kB. The md download is ~775kB.
The book has been refactored, but it still fails to export to HTML within 1 minute.
txt export size: ~150kB; md export size: ~250kB
Able to export each page individually
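For anyone who wants to repeat the per-page check, here is a minimal Python sketch that times each page's HTML export and sorts the results by size. It assumes the BookStack REST API's page export endpoint (`/api/pages/{id}/export/html`) and token authentication. The host is taken from the log above. The token value, the page IDs, and the helper names `fetch_page_html` and `time_exports` are hypothetical.

```python
import time
import urllib.request
from typing import Callable, Iterable, List, Tuple

# Assumptions: host copied from the log in this issue; the token is a placeholder.
BASE_URL = "http://bookstack-service.wiki"
API_TOKEN = "token_id:token_secret"


def fetch_page_html(page_id: int, timeout: float = 300.0) -> bytes:
    """Download one page's HTML export through the BookStack REST API."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/pages/{page_id}/export/html",
        headers={"Authorization": f"Token {API_TOKEN}"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def time_exports(
    page_ids: Iterable[int],
    fetch: Callable[[int], bytes] = fetch_page_html,
) -> List[Tuple[int, float, int]]:
    """Return (page_id, seconds, size_in_bytes) per page, largest export first."""
    results = []
    for page_id in page_ids:
        start = time.perf_counter()
        body = fetch(page_id)
        results.append((page_id, time.perf_counter() - start, len(body)))
    return sorted(results, key=lambda row: row[2], reverse=True)
```

Calling `time_exports([101, 102, 103])` with real page IDs would show which pages dominate the export size. The long client-side timeout only helps if every proxy in front of BookStack also waits that long.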
You need to tell Cloudflare to wait longer for the server to respond. Cloudflare thinks the server is down while your book is being converted to PDF.
Also one page taking 120MB is crazy ... What kind of content do you have on your pages?
Lots of photos.
What's strange is that those couple of huge individual pages take no more than ~3 seconds. Most others were instant. So I'm not sure why the book export explodes.
Yeah, 120MB is super high. If the pages export quickly on their own, that might indicate the book export is hitting some kind of memory limit or exhaustion, or just that the HTML is too large to handle without problems. There might be a more efficient way for us to do the embed/parsing (placeholders, then simple string replacements at the end), but at those kinds of sizes I'd be surprised if other issues didn't pop up anyway. The formats we produce aren't really great for high-image/data content, tbh.
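To illustrate the "placeholder then string replacement" idea mentioned above, here is a rough Python sketch. It is not BookStack's implementation (BookStack is written in PHP). The function name `embed_images`, the `@@IMGn@@` token format, the `load_image` callback, and the fixed `image/png` MIME type are all assumptions made for the example.

```python
import base64
import re
from typing import Callable, List

IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")')
PLACEHOLDER = re.compile(r"@@IMG(\d+)@@")


def embed_images(
    html: str,
    load_image: Callable[[str], bytes],
    mime: str = "image/png",  # assumption: every image is treated as PNG
) -> str:
    """Inline images as data URIs without parsing the large payloads as HTML."""
    urls: List[str] = []

    def to_placeholder(match: "re.Match[str]") -> str:
        # Remember the original URL and leave a short token in its place.
        urls.append(match.group(2))
        return f"{match.group(1)}@@IMG{len(urls) - 1}@@{match.group(3)}"

    # Pass 1: the document stays small while it is parsed or transformed.
    light_html = IMG_SRC.sub(to_placeholder, html)

    # ... any expensive HTML processing would run here on light_html ...

    def to_data_uri(match: "re.Match[str]") -> str:
        raw = load_image(urls[int(match.group(1))])
        return f"data:{mime};base64," + base64.b64encode(raw).decode("ascii")

    # Pass 2: a single string substitution swaps the tokens for the image data.
    return PLACEHOLDER.sub(to_data_uri, light_html)
```

The point of the design is that the heavy base64 strings are never seen by the HTML parser; they are only spliced in once the document is otherwise final. The output is still just as large, so downstream memory and proxy limits can still apply.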
The issue here is that parsing HTML takes a lot of memory, and converting it to PDF is a CPU-intensive task because all of this is done in an old PHP library. PHP itself is just slow. I solved my issue by using https://gotenberg.dev/ and overriding the PDF export. It also solves a lot of weird issues with some Unicode stuff. It uses headless Chrome under the hood.
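As a rough idea of what handing HTML to Gotenberg looks like, here is a Python sketch that posts an HTML string to Gotenberg's Chromium HTML conversion route and returns the PDF bytes. The address `localhost:3000` and the helper names `build_pdf_request` and `html_to_pdf` are assumptions for the example. How the conversion is wired into BookStack's PDF export is not described in this thread, so that part is left out.

```python
import urllib.request
import uuid

# Assumption: Gotenberg is reachable at this address (3000 is its default port).
GOTENBERG_URL = "http://localhost:3000/forms/chromium/convert/html"


def build_pdf_request(html: str, url: str = GOTENBERG_URL) -> urllib.request.Request:
    """Build a multipart POST that uploads the HTML as a file named index.html."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="files"; filename="index.html"\r\n'
        "Content-Type: text/html\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return urllib.request.Request(
        url,
        data=head + html.encode("utf-8") + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def html_to_pdf(html: str, timeout: float = 300.0) -> bytes:
    """Send the request and return the PDF bytes from the response body."""
    with urllib.request.urlopen(build_pdf_request(html), timeout=timeout) as resp:
        return resp.read()
```

With a Gotenberg container running, `open("book.pdf", "wb").write(html_to_pdf(html))` would save the result. Any proxy timeout in front of BookStack still applies to the original export request.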