feature request: add support for adobe forms

Open radub-mtl opened this issue 12 years ago • 19 comments

It would be useful to display form fields and expose the form's attributes to the HTML / JS layer, for example:

This creates potential for more dynamic usage scenarios.

Thanks for a great project Radu

radub-mtl avatar Nov 25 '13 22:11 radub-mtl

Can you provide a sample PDF? Thanks!

coolwanglu avatar Nov 26 '13 04:11 coolwanglu

Sure thing! Here is a simple PDF form with 2 fields, named TextField1 and CheckBox3:

http://pastebin.com/Xsc12JD3

Use pbget (package pastebinit in Ubuntu) to retrieve the PDF file.

Radu

radub-mtl avatar Nov 26 '13 17:11 radub-mtl

Has anyone been working on this one? I'm looking for the same feature and may be ready to spend some time on it (depending on some other things, notably some strange Google Chrome display bugs).

I'm asking in case someone has already started and has some pointers.

afrosimon avatar Jul 18 '14 13:07 afrosimon

I would pay for this feature if someone is able to do it...

mwaschkowski avatar Sep 14 '15 19:09 mwaschkowski

@mwaschkowski: This is currently partially done; you can look up the --process-form parameter. I only took care of what I needed at the time, that is, checkboxes and text fields, hence "partially".

I may be interested in working on this feature again, depending on your use-case...?

afrosimon avatar Sep 17 '15 13:09 afrosimon

I'm looking into hummus PDF, but will let you know if that doesn't work out.

Thank you

Mark

mwaschkowski avatar Sep 17 '15 15:09 mwaschkowski

Hey, we have an absolute need for this...

uberpu avatar Oct 02 '15 20:10 uberpu

pdf2htmlEX.exe --process-form 1 --process-annotation 1 --process-annotation 1 --bg-format svg --debug 1 --optimize-text 1 03ff74a4-61af-4116-a616-862a1ce5a4e7.pdf htms/_03ff74a4-61af-4116-a616-862a1ce5a4e7.htm

Output includes text fields, but they all have the following locations:

position: absolute; left: 0.000000px; bottom: 0.000000px; width: 0.000000px; height: 0.000000px; line-height: 0.000000px; font-size: 0.000000px;

Right number of fields, just .. well hidden?
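A quick way to confirm this symptom on a generated page is to scan the output for form inputs whose inline style has collapsed to zero. The following is only a minimal sketch: it assumes the fields are emitted as `<input>` elements carrying an inline `style` like the one quoted above, and the names `find_collapsed_fields` and `CollapsedFieldFinder` are hypothetical helpers, not part of pdf2htmlEX.

```python
import re
from html.parser import HTMLParser


def parse_style(style):
    """Split an inline style string into a {property: value} dict."""
    props = {}
    for decl in style.split(";"):
        name, sep, value = decl.partition(":")
        if sep:
            props[name.strip().lower()] = value.strip()
    return props


def is_zero_px(value):
    """True for values such as '0px' or '0.000000px'."""
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)px", value)
    return match is not None and float(match.group(1)) == 0.0


class CollapsedFieldFinder(HTMLParser):
    """Counts <input> tags and records those with zero width and height."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.collapsed = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        self.total += 1
        attributes = dict(attrs)
        props = parse_style(attributes.get("style") or "")
        # Assumption: a field is "hidden" when both width and height are 0px.
        if is_zero_px(props.get("width", "")) and is_zero_px(props.get("height", "")):
            self.collapsed.append(attributes.get("id"))


def find_collapsed_fields(html):
    """Return (number of inputs, ids of inputs with zero width and height)."""
    finder = CollapsedFieldFinder()
    finder.feed(html)
    finder.close()
    return finder.total, finder.collapsed
```

Calling `find_collapsed_fields(open(path, encoding="utf-8").read())` on the generated file gives the number of inputs found and the ids of the collapsed ones, which makes it easier to compare runs with different flags.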

uberpu avatar Oct 02 '15 21:10 uberpu

Hm, well first off I haven't tested this feature on Windows, so that might be part of the problem. What if you try with these parameters:

--optimize-text 1 --space-as-offset 1 --process-outline 0 --process-form 1 --font-size-multiplier 10

Otherwise could you find a way to share with us this document / an excerpt?
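For anyone scripting the conversion, the suggested parameter set can be assembled as in the sketch below. It assumes a `pdf2htmlEX` executable is on the PATH and that options come before the input and output paths, as in the commands quoted in this thread; `build_command` and `convert` are hypothetical helper names.

```python
import subprocess

# Flags and values exactly as suggested in the comment above.
SUGGESTED_FLAGS = {
    "--optimize-text": "1",
    "--space-as-offset": "1",
    "--process-outline": "0",
    "--process-form": "1",
    "--font-size-multiplier": "10",
}


def build_command(pdf_path, html_path, executable="pdf2htmlEX"):
    """Return the argument list: executable, flags, input PDF, output HTML."""
    command = [executable]
    for flag, value in SUGGESTED_FLAGS.items():
        command += [flag, value]
    command += [pdf_path, html_path]
    return command


def convert(pdf_path, html_path):
    """Run the conversion; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(pdf_path, html_path), check=True)
```

Keeping the flags in one dictionary makes it simple to toggle a single option between runs when narrowing down which one affects the field positions.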

afrosimon avatar Oct 06 '15 13:10 afrosimon

Surely. When I get back in I will send it. That set of parameters seems rather random: --optimize-text 1 (tried), --space-as-offset 1 (will try), --process-outline 0 (tried both ways), --process-form 1 (a constant), --font-size-multiplier 10 (????)

It's simply that the JavaScript doesn't reassign the position. Is there some sort of exclusion? (Pre-optimisation === evil)

uberpu avatar Oct 07 '15 02:10 uberpu

Is there a private way of getting you these files??

uberpu avatar Oct 07 '15 22:10 uberpu

OK, I have a sample... here... also there as .zip and .7z, using:

pdf2htmlEX.exe --optimize-text 1 --space-as-offset 1 --process-outline 0 --process-form 1 --font-size-multiplier 10  fe247744-a828-467d-a48f-477d9a8b8524.pdf htms\fe247744-a828-467d-a48f-477d9a8b8524.html

uberpu avatar Oct 08 '15 16:10 uberpu

Totally posted, if you have the time...

uberpu avatar Oct 10 '15 01:10 uberpu

I have basically the same issue.

I'm running this version of pdf2htmlEX on latest OS X:

pdf2htmlEX version 0.13.6
Copyright 2012-2014 Lu Wang <[email protected]> and other contributors
Libraries:
  poppler 0.40.0
  libfontforge 20160113
  cairo 1.14.6
Default data-dir: /usr/local/Cellar/pdf2htmlex/0.13.6_7/share/pdf2htmlEX
Supported image format: png jpg svg

I want to be able to target the fields that were fillable in the PDF file. As input fields in the source PDF, they had unique identifiers. I want the HTML output from this project to preserve those identifiers somehow, perhaps as ids. I think I'd want them to be actual HTML input fields, but simply retaining the ids would probably be sufficient.

I'd appreciate any help you could provide.

kris-luminar avatar Jan 20 '16 20:01 kris-luminar

I'm also interested in this; I'm having the same issue with the fields being positioned at 0px.

ezsper avatar Aug 01 '16 13:08 ezsper

Ran into the same issue. Trying to investigate @uberpu's comment about the JS not assigning positions.

gooooer avatar Nov 30 '16 20:11 gooooer

If you solve the problem, could you share your solution or example?

nccheckcashing avatar May 12 '17 21:05 nccheckcashing

Let me find one. I have been getting around it with some really twisted shell scripting, but would much rather have this just work.

I have some reader scripts if you need them, but they use iTextSharp. Speaking of which, what's the best C/C++ library to use?

uberpu avatar May 17 '17 12:05 uberpu

@uberpu Thank you for your reply. I'd like to get your reader scripts; please share them with me via GitHub or email. I'm using iTextSharp on the server side :)

nccheckcashing avatar May 22 '17 13:05 nccheckcashing