[Bug]: PDF to Word Conversion Creates Text Boxes Instead of Editable Text

Open cajamag opened this issue 4 months ago • 8 comments

Installation Method

Docker

The Problem

When converting a PDF document to Word format with Stirling-PDF, the conversion creates a text box (shape) for each piece of text instead of generating normal editable text. This makes the resulting Word document difficult to edit and format.

The conversion creates individual text boxes/shapes for each text element from the PDF, resulting in a Word document where all text is contained within separate text box objects rather than flowing as normal document text.
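One way to confirm this on a converted file is to measure how much of its text sits inside text boxes. The sketch below is a minimal illustration and not part of Stirling-PDF: it assumes the DOCX keeps its main content in `word/document.xml` and marks text box content with `w:txbxContent` elements, as WordprocessingML does. The names `text_chars` and `inspect_docx` are hypothetical.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
MC_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006"
TXBX = f"{{{W_NS}}}txbxContent"   # container for the content of a text box
W_T = f"{{{W_NS}}}t"              # a run of text
FALLBACK = f"{{{MC_NS}}}Fallback" # legacy duplicate of a shape, skipped below


def text_chars(element, in_box=False):
    """Return (chars_in_text_boxes, chars_in_normal_flow) below element."""
    boxed = flow = 0
    for child in element:
        # Shapes are often stored twice (modern form plus fallback);
        # skipping the fallback avoids counting the same text twice.
        if child.tag == FALLBACK:
            continue
        child_in_box = in_box or child.tag == TXBX
        if child.tag == W_T and child.text:
            if child_in_box:
                boxed += len(child.text)
            else:
                flow += len(child.text)
        sub_boxed, sub_flow = text_chars(child, child_in_box)
        boxed += sub_boxed
        flow += sub_flow
    return boxed, flow


def inspect_docx(path):
    """Open a DOCX file and count boxed versus flowing text characters."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return text_chars(root)


if __name__ == "__main__" and len(sys.argv) > 1:
    boxed, flow = inspect_docx(sys.argv[1])
    print(f"characters in text boxes: {boxed}, in normal flow: {flow}")
```

A file affected by this issue would be expected to report most or all of its characters inside text boxes, which makes it easy to compare the output of different versions on the same PDF.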

Version of Stirling-PDF

1.2.0

Last Working Version of Stirling-PDF

No response

Page Where the Problem Occurred

pdf-to-word

Docker Configuration


Relevant Log Output


Additional Information

No response

Browsers Affected

No response

No Duplicate of the Issue

  • [x] I have verified that there are no existing issues raised related to my problem.

cajamag avatar Sep 01 '25 19:09 cajamag

Hi @cajamag! I'm Dosu and I’m helping the Stirling-PDF team.

This behavior is a limitation of how LibreOffice handles PDF imports when converting to Word format—each text element from the PDF is placed in a separate text box or shape, rather than as flowing, editable text. Stirling-PDF uses LibreOffice for this conversion and does not provide configuration options to change this output structure. There are no settings or workarounds in Stirling-PDF to force editable text output; the only way to change this would be improvements in LibreOffice itself or significant changes to the application code. For now, if you need fully editable Word documents, you may want to try alternative PDF to Word tools that specialize in text extraction and reflow, as this is a known limitation of the current approach in Stirling-PDF (source).

To reply, just mention @dosu.



dosubot[bot] avatar Sep 01 '25 19:09 dosubot[bot]

That is not a solution. In previous versions (I don’t recall which one), when we converted PDF files to DOCX, the text boxes did not appear.

Could you please clarify why this behavior has changed and if there is a way to achieve the same result as before?

cajamag avatar Sep 09 '25 21:09 cajamag

The change in DOCX output, where text boxes now appear instead of editable, flowing text, is due to how LibreOffice handles PDF imports during conversion. Stirling-PDF relies on LibreOffice for PDF to Word conversion, and the Docker setup installs LibreOffice from your Linux distribution’s package repository. This means the LibreOffice version can change with system updates or Docker image rebuilds, and different versions of LibreOffice may produce different DOCX structures from the same PDF. There are no configuration options or settings in Stirling-PDF to control this behavior or revert to the previous output; the conversion process uses LibreOffice’s defaults, and this limitation has been reported in both older and newer versions of Stirling-PDF depending on the LibreOffice version and the PDF content itself (reference).

If you need the previous behavior, the only potential workaround is to manually pin or install a specific LibreOffice version in your Docker container that produced the desired output, but this is not documented or officially supported in Stirling-PDF (reference). For consistently editable DOCX output, you may need to use a dedicated PDF-to-Word tool that specializes in text extraction and reflow, as this is a known limitation of the current approach (reference).
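To find out which LibreOffice version produces which output, it can help to record the installed version next to each test conversion. The following is a rough sketch under stated assumptions: the binary is assumed to be on the `PATH` as `soffice`, and the `writer_pdf_import` input filter is assumed to be the one used for PDF import into Writer. Whether Stirling-PDF invokes LibreOffice with exactly these options is not confirmed in this thread, and the helper names are hypothetical.

```python
import shutil
import subprocess


def build_convert_command(pdf_path, out_dir, binary="soffice"):
    """Build a headless LibreOffice command line for a PDF to DOCX conversion."""
    return [
        binary,
        "--headless",
        "--infilter=writer_pdf_import",  # assumption: import the PDF via Writer
        "--convert-to", "docx",
        "--outdir", out_dir,
        pdf_path,
    ]


def libreoffice_version(binary="soffice"):
    """Return the version string reported by LibreOffice, or None if not found."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


def convert(pdf_path, out_dir, binary="soffice"):
    """Run the conversion and return the LibreOffice version that performed it."""
    version = libreoffice_version(binary)
    if version is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    subprocess.run(build_convert_command(pdf_path, out_dir, binary), check=True)
    return version
```

Running `convert` on the same PDF inside containers built from different images would show whether the text box behavior follows the LibreOffice version, which is the premise of the pinning workaround.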


dosubot[bot] avatar Sep 09 '25 21:09 dosubot[bot]

Same problem here. Why isn't it possible to choose between "boxes" and "flow"? I think this is a very important function of Stirling-PDF.

Our users have gone back to using external services because of this issue.

kintaro1981 avatar Sep 18 '25 14:09 kintaro1981

@kintaro1981 Same problem for me. I blocked all external pages, but users can’t rely on Stirling-PDF because the function they need most is PDF to Word.

I’d really appreciate it if this issue could be resolved.

cajamag avatar Sep 18 '25 15:09 cajamag

@dosu are you suggesting that we use a dedicated conversion tool instead of Stirling-PDF? I would like to use only one tool, as I think many other users would.

kintaro1981 avatar Sep 22 '25 10:09 kintaro1981

Any news?

cajamag avatar Oct 01 '25 03:10 cajamag

Hi, any news?

cajamag avatar Oct 16 '25 14:10 cajamag