content_page.merge_page(image_page) fails with transformed content_page

Open MartinThoma opened this issue 1 year ago • 1 comments

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-5.4.0-125-generic-x86_64-with-glibc2.31

$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.10.4-main

Code + PDF

This is a minimal, complete example that shows the issue:

from pathlib import Path
from typing import Union, Literal, List

from PyPDF2 import PdfWriter, PdfReader, Transformation


def stamp(
    content_pdf: Path,
    stamp_pdf: Path,
    pdf_result: Path,
    page_indices: Union[Literal["ALL"], List[int]] = "ALL",
):
    reader = PdfReader(stamp_pdf)
    image_page = reader.pages[0]


    writer = PdfWriter()

    reader = PdfReader(content_pdf)
    if page_indices == "ALL":
        page_indices = list(range(0, len(reader.pages)))
    for index in page_indices:
        content_page = reader.pages[index]
        content_page.add_transformation(Transformation().rotate(90))
        mediabox = content_page.mediabox
        content_page.merge_page(image_page)
        content_page.mediabox = mediabox
        writer.add_page(content_page)

    with open(pdf_result, "wb") as fp:
        writer.write(fp)

stamp(Path("page.pdf"), Path("stamp.pdf"), Path("out.pdf"))

The input files:

The output PDF:

I would have expected something like this:

MartinThoma, Aug 29 '22 17:08

page.pdf is a landscape PDF with no "/Rotate" attribute, and the same is true of the stamp page (image_page). The coordinate origin is therefore at the bottom left for both, so neither page needs to be transformed to get the expected result. The same function with the transformation commented out is enough:

def stamp(
    content_pdf: Path,
    stamp_pdf: Path,
    pdf_result: Path,
    page_indices: Union[Literal["ALL"], List[int]] = "ALL",
):
    reader = PdfReader(stamp_pdf)
    image_page = reader.pages[0]


    writer = PdfWriter()

    reader = PdfReader(content_pdf)
    if page_indices == "ALL":
        page_indices = list(range(0, len(reader.pages)))
    for index in page_indices:
        content_page = reader.pages[index]
        # No transformation needed: both pages are landscape without /Rotate.
        # content_page.add_transformation(Transformation().rotate(90))
        mediabox = content_page.mediabox
        content_page.merge_page(image_page)
        content_page.mediabox = mediabox
        writer.add_page(content_page)

    with open(pdf_result, "wb") as fp:
        writer.write(fp)

The result I got is the following: iss1301out.pdf
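To illustrate why a rotation about the origin is rarely what you want on an unrotated page, here is a minimal sketch in plain Python (no PyPDF2 calls). It applies the standard counter-clockwise rotation matrix to the four corners of a page. The page size (A4 landscape, in points) and the helper name rotate_about_origin are assumptions made for the example, not values taken from page.pdf.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rotate_about_origin(point: Point, degrees: float) -> Point:
    """Rotate a point counter-clockwise about (0, 0)."""
    theta = math.radians(degrees)
    x, y = point
    return (
        x * math.cos(theta) - y * math.sin(theta),
        x * math.sin(theta) + y * math.cos(theta),
    )


# Hypothetical landscape page size in points (A4 landscape); an assumption
# for this sketch, not measured from the files attached to the issue.
WIDTH, HEIGHT = 842.0, 595.0
corners: List[Point] = [(0.0, 0.0), (WIDTH, 0.0), (WIDTH, HEIGHT), (0.0, HEIGHT)]

rotated = [rotate_about_origin(corner, 90) for corner in corners]

# Round away floating-point noise before taking the bounding box.
xs = [round(x, 6) for x, _ in rotated]
ys = [round(y, 6) for _, y in rotated]
rotated_bbox = (min(xs), min(ys), max(xs), max(ys))

print("original bbox:", (0.0, 0.0, WIDTH, HEIGHT))
print("rotated bbox: ", rotated_bbox)
```

With the origin at the bottom left, the rotated corners all have x <= 0, so the rotated content sits to the left of the original media box. The stamp page is merged in untransformed coordinates, so under these assumptions the two no longer line up. Whether this fully explains the output in this issue is not confirmed here.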

This issue may be closed.

pubpub-zz, Sep 10 '22 14:09