Conversion very irregular and out of format

Open HassanRaza1313 opened this issue 4 years ago • 2 comments

Please see the attached .pdf file and the resulting .docx file. The formatting gets broken up and looks very odd. Attachments: generated.docx, out.pdf

HassanRaza1313 · Oct 04 '21 11:10

@HassanRaza1313 I also ran into this issue. Have you resolved it using any Python module or API?

@dothinking Can you please comment on this issue?

yugaljain1999 · Jan 16 '22 19:01

I worked around this issue by adding something like the following after conversion. It seemed like the paragraph space before / space after was the culprit.

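  from docx.enum.text import WD_LINE_SPACING
  from docx.shared import Pt

  # `document` is the converted .docx opened with python-docx (docx.Document)
  # Adjust Paragraph Space Before / Space After
  for paragraph in document.paragraphs:
      paragraph.paragraph_format.line_spacing_rule = WD_LINE_SPACING.SINGLE
      # Cap any explicit space before at 12 pt (None means inherited from the style)
      space_before = paragraph.paragraph_format.space_before
      if space_before and space_before.pt > 12:
          paragraph.paragraph_format.space_before = Pt(12)
      # Cap any explicit space after at 12 pt
      space_after = paragraph.paragraph_format.space_after
      if space_after and space_after.pt > 12:
          paragraph.paragraph_format.space_after = Pt(12)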

jamespjarvis · May 03 '22 15:05
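
For anyone landing here, this is a rough end-to-end sketch of the workaround above: convert with pdf2docx, reopen the result with python-docx, cap the paragraph spacing, and save. The helper names (`cap_spacing`, `iter_all_paragraphs`, `convert_and_fix`) and the 12 pt default are my own assumptions, not part of either library. It also walks paragraphs inside top-level tables, since `document.paragraphs` only covers body-level paragraphs; whether that matters for your file is a guess on my part.

```python
def cap_spacing(paragraphs, limit):
    """Cap space_before/space_after at `limit` (a length with a .pt attribute).

    Returns the number of values that were changed.
    """
    changed = 0
    for paragraph in paragraphs:
        fmt = paragraph.paragraph_format
        for attr in ("space_before", "space_after"):
            value = getattr(fmt, attr)
            # None means the spacing is inherited from the style; leave it alone.
            if value is not None and value.pt > limit.pt:
                setattr(fmt, attr, limit)
                changed += 1
    return changed


def iter_all_paragraphs(document):
    """Yield body paragraphs, then paragraphs inside top-level table cells."""
    yield from document.paragraphs
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                yield from cell.paragraphs


def convert_and_fix(pdf_path, docx_path, max_pt=12):
    """Hypothetical wrapper: convert the PDF, then cap spacing in the output."""
    # Local imports so the helpers above work without these packages installed.
    from docx import Document
    from docx.shared import Pt
    from pdf2docx import Converter

    cv = Converter(pdf_path)
    try:
        cv.convert(docx_path)
    finally:
        cv.close()

    document = Document(docx_path)
    cap_spacing(iter_all_paragraphs(document), Pt(max_pt))
    document.save(docx_path)
```

Usage would be something like `convert_and_fix("out.pdf", "generated.docx")`. This only touches spacing, so it may not help if the layout problems in your file come from something else.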