PDF Export not exporting images

Open mariusdill opened this issue 1 year ago • 17 comments

Describe the Bug

When I export a page as PDF, the PDF does not include the images. It only contains the name and link of each image, followed by the message "Image not found or type unknown".

The odd thing is that the export works for recently created pages. I can't tell when it stopped working, but a few months ago I could export the old pages correctly.

eistik-updateanleitung.pdf

Steps to Reproduce

Export Page -> PDF

Expected Behaviour

Exported PDF includes all images

Screenshots or Additional Context

2023-06-13 08_18_42-Window

Browser Details

Chrome 114 on Windows 10

Exact BookStack Version

23.05.2

PHP Version

8.2.6

Hosting Environment

Ubuntu 22.04 VM Docker Compose Setup from Linuxserver.io

mariusdill avatar Jun 13 '23 06:06 mariusdill

This usually occurs when BookStack can't trace the image file to a local resource based on its URL. Have you perhaps changed the base URL at some point, even if just from http:// to https://?

Otherwise, it could be due to image file access permissions, or to the existence of, and permissions on, the page the images were originally uploaded to, but I haven't fully validated that, as I haven't been in that part of the code for a while.

ssddanbrown avatar Jun 13 '23 15:06 ssddanbrown

If I remember correctly, APP_URL was not set in my initial setup because it didn't work when I set it to the right URL. But I changed it to the right URL about a year ago.

mariusdill avatar Jun 14 '23 11:06 mariusdill

The URL of the image is correct; I can open it in BookStack.

mariusdill avatar Jun 14 '23 12:06 mariusdill

The URL of the image is correct; I can open it in BookStack.

But does the start of it, when copied into a text editor (not when opened via a browser), exactly match the configured APP_URL value, including the leading https:// or http:// component?

ssddanbrown avatar Jun 14 '23 12:06 ssddanbrown

Yes

http://bookstack/uploads/images/gallery/2022-02/eisitk-net-update.png

APP_URL= http://bookstack

mariusdill avatar Jun 14 '23 12:06 mariusdill
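
For anyone who wants to script the prefix comparison asked about above, here is a minimal Python sketch. The function name `image_url_matches_app_url` is made up for illustration, and the check only mirrors the manual "does the image URL start with APP_URL" comparison; it is not BookStack's own lookup logic.

```python
def image_url_matches_app_url(image_url: str, app_url: str) -> bool:
    """Return True if image_url literally begins with app_url."""
    # Trim stray whitespace and any trailing slash, then require a "/"
    # boundary so "http://bookstack" does not match "http://bookstack2/...".
    base = app_url.strip().rstrip("/") + "/"
    return image_url.strip().startswith(base)


url = "http://bookstack/uploads/images/gallery/2022-02/eisitk-net-update.png"
print(image_url_matches_app_url(url, "http://bookstack"))   # True
print(image_url_matches_app_url(url, "https://bookstack"))  # False
```

With the values posted in this thread the check passes, which fits the later finding that the cause lay elsewhere.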

We use the PDF export as a backup. It's very important for my department that these manuals are available when our system is down. Is there any other way to do this? The use case would be offline access from a USB stick.

mariusdill avatar Jul 12 '23 08:07 mariusdill

What BookStack generated isn't really a "portable document format" document, because it does not embed the images; it just points to external URLs. I suppose this is useful in some cases, such as keeping images secure.
I have a workaround that I think works: first export your page or book as HTML (which will have URLs), then open the HTML document in a browser and export that as a .pdf. I am running on an isolated, secure computer, so I have to put the file on a thumb drive (no e-mail for me) and take it to an outside computer. So far it seems to work. I see that Stack Overflow has a topic on "adding an image with dompdf", so I know it's possible to export portable documents directly if BookStack wants this.
An option to include the image's filename under the image might be nice too.

cdieterich-nj avatar Jul 20 '23 22:07 cdieterich-nj

Thanks for your tip. When I export the pages in HTML the images are all there. Very weird issue but at least I have a working offline backup now :)

mariusdill avatar Oct 13 '23 07:10 mariusdill

For what it's worth, Dan, I think an option (in the admin-cli perhaps?) to validate and perhaps correct media URLs would be helpful. We've also seen this issue, and we hacked around it (to create HTML then PDF) but in our case government regulators require PDF and we can't provide HTML. So something an end user should be able to accomplish has to go to help desk. Not urgent of course, we have workarounds...

crashmaster18 avatar Oct 15 '23 12:10 crashmaster18

Modern browsers' native print functionality can save the result as a PDF file. This is something most users should be able to handle.

otherjoel avatar Oct 15 '23 16:10 otherjoel

Sure you can, but when you store important information in BookStack that needs to be available during an outage, you need an offline backup. And the only way to get that is to export all books with a script.

mariusdill avatar Oct 16 '23 07:10 mariusdill
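
A rough idea of what such an export script could look like in Python, using only the standard library. It is a sketch under several assumptions that should be checked against the API documentation of your own instance: that the REST API is enabled, that `GET /api/books` lists books under a `data` key, that `GET /api/books/{id}/export/html` returns the HTML export, and that an API token is sent as `Authorization: Token <id>:<secret>`. The helper names and the `count=500` page size are illustrative, and pagination is not handled.

```python
import json
import urllib.request
from pathlib import Path


def auth_headers(token_id: str, token_secret: str) -> dict:
    # Assumed token header format; verify against your instance's API docs.
    return {"Authorization": f"Token {token_id}:{token_secret}"}


def export_url(base_url: str, book_id: int, fmt: str = "html") -> str:
    # Assumed export endpoint layout: /api/books/{id}/export/{format}
    return f"{base_url.rstrip('/')}/api/books/{book_id}/export/{fmt}"


def fetch(url: str, headers: dict) -> bytes:
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return response.read()


def export_all_books(base_url: str, token_id: str, token_secret: str, out_dir: str) -> None:
    headers = auth_headers(token_id, token_secret)
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    # Only the first page of results is fetched; larger instances need paging.
    listing_url = f"{base_url.rstrip('/')}/api/books?count=500"
    listing = json.loads(fetch(listing_url, headers))
    for book in listing["data"]:
        html = fetch(export_url(base_url, book["id"]), headers)
        (target / f"book-{book['id']}.html").write_bytes(html)
```

Calling `export_all_books("http://bookstack", token_id, token_secret, "backup")` would write one HTML file per book into `backup/`. HTML is used here because, as reported above, the HTML export kept the images when the PDF export did not.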

@mariusdill Are you using the WYSIWYG Docx Import Hack when creating pages? I am getting the "Image not found or type unknown" error when exporting to PDF for a bunch of pages. The common theme is that they were created from docx files, so that import seems to break something for the export. I noticed that if I edit the page, click on each image, choose the "Add/Edit Image" option to bring up the edit image window, and then close it, the problem is fixed.

EDIT: In my case, it appears to happen because the import inserts two line returns into the alt text of the image: image

Any of the image entries with the line returns are the ones that do not export. image

@ssddanbrown is there a way to mass update all tables to find/replace these line returns and remove them?

mdezzi avatar Dec 11 '23 20:12 mdezzi

No, but we copied a lot of images from Word documents, so it could be the same problem.

mariusdill avatar Dec 11 '23 21:12 mariusdill

In our shop, all images go directly from cell phone cameras into BookStack.

cdieterich-nj avatar Dec 12 '23 02:12 cdieterich-nj

As @mdezzi mentioned, I used the WYSIWYG Docx Import Hack too, and now I have the same problem exporting PDFs. I searched the database for the page records and found that the images had alt attributes broken by CRLF line breaks. notepad++_iH5PxKIOWg Removing those breaks repairs PDF export. I think it's probably caused by the mammoth.js library, which uses the auto-generated .docx alt descriptions when inserting images.

itsTurnip avatar Feb 20 '24 11:02 itsTurnip
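
The clean-up described in the last two reports (removing line breaks from image alt text) can be sketched in Python as below. This is a regex-based illustration that works on a string of page HTML; the function name `collapse_alt_line_breaks` is hypothetical, and how the HTML is read from and written back to the database is deliberately left out. Take a backup before changing stored content.

```python
import re

# Matches alt="..." or alt='...', letting the value span several lines.
ALT_ATTR = re.compile(r'(\balt\s*=\s*)(["\'])(.*?)\2', re.DOTALL | re.IGNORECASE)


def collapse_alt_line_breaks(html: str) -> str:
    """Replace line breaks (and other whitespace runs) in alt values with single spaces."""
    def fix(match: re.Match) -> str:
        prefix, quote, value = match.groups()
        cleaned = " ".join(value.split())
        return f"{prefix}{quote}{cleaned}{quote}"

    return ALT_ATTR.sub(fix, html)


broken = '<img src="a.png" alt="Shape\r\n\r\nDescription automatically generated">'
print(collapse_alt_line_breaks(broken))
# <img src="a.png" alt="Shape Description automatically generated">
```

A regex is a blunt tool for HTML: it would also touch `alt="..."` text that happens to appear in page content, so an HTML parser would be the safer choice for a real mass update.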

@ssddanbrown I am facing an issue: after exporting a document to PDF, the images disappear. The PDF shows "unknown type" along with a reference to the image, and when that reference is opened the image is visible. Can you look into this issue? image

After clicking on the text, the actual image is shown. image

@ssddanbrown please look into the issue; I am waiting for your help.

I am using Ubuntu with apache2.

shaiksohail11 avatar Apr 22 '24 12:04 shaiksohail11

We have the same issue. The observations we made:

  • If I paste the image into the page from a print screen (screenshot), the PDF export is fine and the image is exported
  • If I add the image from the book's attachments, the image is not exported

image image

maks2199 avatar Oct 09 '24 07:10 maks2199