merge_page with passed page having markup annotation fails

Open stefan6419846 opened this issue 3 months ago • 1 comments

PageObject.merge_page fails when the page passed to it contains a markup annotation. The reason is that these annotations inherit clone from their base class DictionaryObject, and DictionaryObject.clone assumes that new instances of the corresponding classes can be created without parameters, which is not the case here.
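
The failure mode can be shown without pypdf. The following is only a minimal sketch: BaseDict and PolygonLike are hypothetical stand-ins, not pypdf classes, and the clone method merely imitates the `self.__class__()` call visible in the traceback below.

```python
class BaseDict(dict):
    """Hypothetical stand-in for a dictionary-like base class (not pypdf code)."""

    def clone(self):
        # Same pattern as in the traceback: the subclass is instantiated
        # without any arguments.
        new = self.__class__()
        new.update(self)
        return new


class PolygonLike(BaseDict):
    """Hypothetical stand-in for an annotation with a required argument."""

    def __init__(self, vertices):
        super().__init__()
        # Store the points as a flat list of coordinates.
        self["/Vertices"] = [coord for point in vertices for coord in point]


annotation = PolygonLike(vertices=[(50, 550), (200, 650)])

error_message = None
try:
    annotation.clone()
except TypeError as exc:
    # The constructor cannot be called without 'vertices'.
    error_message = str(exc)
```

Any subclass whose __init__ has required parameters will hit the same TypeError as soon as the inherited clone is used.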

Along the same lines, it is not clear to me how passing the corresponding attributes currently works.

To mitigate this, we would ideally have a generic solution, possibly a mapping in each class derived from DictionaryObject that maps keys of self to parameters of __init__, which would avoid code duplication.

Initially discovered in https://github.com/py-pdf/pypdf/pull/3291#issuecomment-3304145646.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-6.4.0-150600.23.65-default-x86_64-with-glibc2.38

$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==6.0.0, crypt_provider=('cryptography', '44.0.0'), PIL=11.1.0

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfWriter
from pypdf.annotations import Polygon


writer = PdfWriter()
writer2 = PdfWriter()
writer.add_blank_page(100, 100)
writer2.add_blank_page(100, 100)

annotation = Polygon(
    vertices=[(50, 550), (200, 650), (70, 750), (50, 700)],
)
writer.add_annotation(0, annotation)

page1 = writer.pages[0]
page2 = writer2.pages[0]
page2.merge_page(page1)

No PDF file is required, as it will be created on the fly by the above code.

Traceback

This is the complete traceback I see:

tests/test_page.py:1508 (test_merge_page_with_annotations)
def test_merge_page_with_annotations():
        writer = PdfWriter()
        writer2 = PdfWriter()
        writer.add_blank_page(100, 100)
        writer2.add_blank_page(100, 100)
    
        from pypdf.annotations import Polygon
        annotation = Polygon(
            vertices=[(50, 550), (200, 650), (70, 750), (50, 700)],
        )
        writer.add_annotation(0, annotation)
    
        page_one = writer.pages[0]
        page_two = writer2.pages[0]
>       page_two.merge_page(page_one)

tests/test_page.py:1523: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pypdf/_page.py:1062: in merge_page
    self._merge_page(page2, over=over, expand=expand)
pypdf/_page.py:1080: in _merge_page
    return self._merge_page_writer(
pypdf/_page.py:1235: in _merge_page_writer
    aa = a.clone(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = {'/Type': '/Annot', '/Subtype': '/Polygon', '/Vertices': [50, 550, 200, 650, 70, 750, 50, 700], '/IT': '/PolygonCloud', '/Rect': RectangleObject([50, 550, 200, 750]), '/P': IndirectObject(4, 0, 140694769602768)}
pdf_dest = <pypdf._writer.PdfWriter object at 0x7ff60dc3a310>
force_duplicate = True, ignore_fields = ('/P', '/StructParent', '/Parent')

    def clone(
        self,
        pdf_dest: PdfWriterProtocol,
        force_duplicate: bool = False,
        ignore_fields: Optional[Sequence[Union[str, int]]] = (),
    ) -> "DictionaryObject":
        """Clone object into pdf_dest."""
        try:
            if self.indirect_reference.pdf == pdf_dest and not force_duplicate:  # type: ignore
                return self
        except Exception:
            pass
    
        visited: set[tuple[int, int]] = set()  # (idnum, generation)
        print(type(self), self)
        d__ = cast(
            "DictionaryObject",
>           self._reference_clone(self.__class__(), pdf_dest, force_duplicate),
        )
E       TypeError: Polygon.__init__() missing 1 required positional argument: 'vertices'

pypdf/generic/_data_structures.py:297: TypeError

stefan6419846, Sep 17 '25 18:09

hello, i would like to try to solve this. i think i may have a solution that is generic enough. i will open pull request when it is ready

HSY-999, Nov 20 '25 21:11