[Bug]: flatten form field with text inside. Text is lost.
Installation Method
None
The Problem
When I flatten a PDF with lots of form fields, I expect the text inside the fields to be kept and only the form fields themselves to be removed.
Stirling removes everything. No text is visible afterwards.
Is this a bug or by design, and how can I keep the text inside the form fields?
Version of Stirling-PDF
0.29
Last Working Version of Stirling-PDF
No response
Page Where the Problem Occurred
No response
Docker Configuration
No response
Relevant Log Output
No response
Additional Information
No response
Browsers Affected
No response
No Duplicate of the Issue
- [X] I have verified that there are no existing issues raised related to my problem.
Seems like a bug. Could you provide an example PDF for testing?
Will work on this. Waiting for the PDF.
I've just run it again, and this time I ticked the box "flatten form fields only". It worked and kept the form text, but the file is partially corrupted: when I open the PDF in my PDF editor (PDF-XChange), it shows a red-flag warning, "ERRORS DETECTED IN THE XREF TABLE".
Is this expected, or is it simple to fix? I know it's not a big deal and could be ignored, but I'd like to find a way to clear it.
I will provide a sample PDF ASAP so you can test it yourself.
Sample file. If you tick "flatten form fields only", it works and preserves the text, but the XREF errors after flattening still need to be resolved.
@Frooodle can you assign me?
Separate question: I want to use this as a web API, but I cannot load the API documentation in Swagger. It is stuck on the loading page.
Can you tell me the web API call for flatten with "form fields only"?
@Frooodle this works https://stackoverflow.com/a/71159599.
It happens with every PDF returned by the method pdfDocToWebResponse.
Solution
document.save(baos, CompressParameters.NO_COMPRESSION);
Is it worth it?
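For anyone who wants to compare output files before and after a change like this, here is a rough sketch in plain Java (no PDFBox) that reports what the file's final `startxref` entry points at: a classic `xref` table or an indirect object, which is typically a cross-reference stream. The class and method names are made up for illustration, and this is only a quick layout check. It does not validate the table, so it cannot confirm or rule out the warning PDF-XChange shows.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Stirling-PDF or PDFBox.
class XrefKind {
    /** Returns "table", "stream", or "unknown" for the target of the last startxref. */
    static String detect(byte[] pdf) {
        // ISO-8859-1 maps each byte to one char, so string indexes equal byte offsets.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        int marker = text.lastIndexOf("startxref");
        if (marker < 0) return "unknown";

        // The byte offset of the cross-reference section follows the keyword.
        Matcher m = Pattern.compile("\\s*(\\d{1,10})").matcher(text);
        m.region(marker + "startxref".length(), text.length());
        if (!m.lookingAt()) return "unknown";

        long offset = Long.parseLong(m.group(1));
        if (offset >= text.length()) return "unknown";

        String at = text.substring((int) offset);
        // A classic cross-reference table starts with the keyword "xref".
        if (at.startsWith("xref")) return "table";
        // An indirect object header ("n g obj") here is assumed to be an xref stream.
        if (at.matches("(?s)\\d+\\s+\\d+\\s+obj.*")) return "stream";
        return "unknown";
    }
}
```

Calling `XrefKind.detect(Files.readAllBytes(path))` on the flattened file from both builds shows whether the save option changed the cross-reference layout at all.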
Hi team, is there an update I can download with the above code change?
I'm happy to use a beta and provide feedback.
Sorry for not getting back. If the compression size difference isn't huge, then we can certainly go with this.
Can you please compile a build I can test today and give feedback?
@Frooodle warning: the comment above me from @ummm288 looks like a virus. Can you please ban/delete?
Also, any update on this? :) Can you provide a test build release?
I tried many other functions of Stirling to see if I could fix the XREF issue, but nothing works.
Hi @Frooodle @iib0011, could we please implement the suggestion from @iib0011 to fix the red XREF warning errors?
As this is a minor change, can it be included in the next bugfix release?
@Frooodle the bug has been fixed since I updated to the latest version today: 0.43.2
Did you fix it deliberately, or was it just luck? :)
Lucky 😂 Thanks for the update!
One last question: can you help me with the correct API HTTP call for this function?
Edit: I found this and will test it: https://app.swaggerhub.com/apis-docs/Frooodle/Stirling-PDF/0.43.2#/Misc/flatten
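In case it helps others asking the same thing, here is a minimal sketch of that call using only the JDK's `java.net.http` client. The endpoint path `/api/v1/misc/flatten` and the form field names `fileInput` and `flattenOnlyForms` are my assumptions from the Swagger page linked above, so please check them against the docs for your version. The class name and boundary string are made up, and authentication (if your instance has login enabled) is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client sketch, not an official Stirling-PDF example.
class FlattenCall {
    static final String BOUNDARY = "----flatten-example-boundary";

    /** Builds a multipart/form-data body carrying the PDF and the forms-only flag. */
    static byte[] multipartBody(String fileName, byte[] pdf, boolean formsOnly) {
        String filePart = "--" + BOUNDARY + "\r\n"
            // Assumed field name for the uploaded file.
            + "Content-Disposition: form-data; name=\"fileInput\"; filename=\"" + fileName + "\"\r\n"
            + "Content-Type: application/pdf\r\n\r\n";
        String flagPart = "\r\n--" + BOUNDARY + "\r\n"
            // Assumed field name for the "flatten form fields only" checkbox.
            + "Content-Disposition: form-data; name=\"flattenOnlyForms\"\r\n\r\n"
            + formsOnly + "\r\n--" + BOUNDARY + "--\r\n";

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.writeBytes(filePart.getBytes(StandardCharsets.UTF_8));
        out.writeBytes(pdf);
        out.writeBytes(flagPart.getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    /** Posts the PDF to the (assumed) flatten endpoint and returns the raw response. */
    static HttpResponse<byte[]> send(String baseUrl, byte[] pdf)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/v1/misc/flatten"))
            .header("Content-Type", "multipart/form-data; boundary=" + BOUNDARY)
            .POST(HttpRequest.BodyPublishers.ofByteArray(multipartBody("input.pdf", pdf, true)))
            .build();
        return HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofByteArray());
    }
}
```

With a local instance this would be used as `FlattenCall.send("http://localhost:8080", Files.readAllBytes(Path.of("in.pdf")))`, writing `response.body()` to disk when the status code is 200.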
Sorry, I was incorrect. I tested the wrong PDF sample somehow... The XREF errors are still present after flattening a sample file.
Here is the sample PDF for flatten testing. Can we please reopen the bug? Can the proposal from @iib0011 be considered for next release testing?
Or can I beta test a branch version for you?
Hi, quick update: this is still happening on version 1.2.0, just tested today :(
Hi,
So I've been working on the fill-form stuff, and I strongly suspect that I ran into this and more or less solved it. The text is lost because the fields don't have proper appearance streams when flattening/optimizing, not because of "compression" itself. The correct fix is to regenerate the appearances (refresh) and then flatten, with NeedAppearances disabled so viewers don't rely on implicit rendering paths...
The more in-depth explanation of why this happens: AcroForm field values live in field dictionaries (e.g., the /V entry), but what actually renders on the page is the widget's appearance stream (/AP), which must exist or be generated before flattening or further optimization (one of the ways compression works is, you guessed it, by removing lower-level objects that are not strictly needed just for viewing the content). If fields only have values and no up-to-date /AP, many tools won't show anything once those fields are flattened or the annotations are discarded, because they only burn in what the /AP contains, not the raw /V value.
The solution should be along the lines of calling RefreshFieldAppearances/refreshAppearances, which regenerates the /AP objects from the "lower-level" /V object (the one that holds the "new" text), and setting NeedAppearances to false.
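To make the /V versus /AP point concrete, here is a toy model in plain Java. It is not PDFBox and not Stirling's code: `ToyField` and `ToyFlatten` are invented names, with `value` standing in for /V and `appearance` for /AP. It only illustrates why flattening without a refresh drops or freezes the text.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a form field; real fields are PDF dictionaries.
class ToyField {
    String value;       // plays the role of /V: the stored field value
    String appearance;  // plays the role of /AP: what gets drawn; null if missing

    ToyField(String value, String appearance) {
        this.value = value;
        this.appearance = appearance;
    }

    /** The "refresh" step: rebuild the appearance from the stored value. */
    void refreshAppearance() {
        appearance = value;
    }
}

class ToyFlatten {
    /** Flattening burns in only the appearance; the raw value is never consulted. */
    static List<String> flatten(List<ToyField> fields) {
        List<String> pageText = new ArrayList<>();
        for (ToyField field : fields) {
            if (field.appearance != null) {
                pageText.add(field.appearance);
            }
        }
        return pageText;
    }
}
```

A field created as `new ToyField("Alice", null)` contributes nothing when flattened, and one with a stale appearance contributes the old text. Calling `refreshAppearance()` on each field first fixes both cases. In PDFBox terms I'd expect the matching sequence on `PDAcroForm` to be `setNeedAppearances(false)`, `refreshAppearances()`, then `flatten()`, but I haven't checked how Stirling's flatten path is wired, so treat that as a suggestion rather than the confirmed fix.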
Oh yeah, on a related note: #2006 (this issue, "flatten form field with text inside. Text is lost") and #3920 are not the same issue in any way. The XREF fix would require wildly different code than this.
The status says "Next to pickup", but is anybody actually working on this? This should be a good first issue for people.