Text layer may contain overlapping areas (react-pdf 9.0.0)

Open • obecker opened this issue 3 weeks ago • 1 comment

Before you start - checklist

  • [X] I followed instructions in documentation written for my React-PDF version
  • [X] I have checked if this bug is not already reported
  • [X] I have checked if an issue is not listed in Known issues
  • [X] If I have a problem with PDF rendering, I checked if my PDF renders properly in PDF.js demo

Description

After upgrading react-pdf from 8.0.2 to 9.0.0 I observed that consecutive spans in the same line within the text layer may overlap (i.e. the spans are too wide). This prevents the correct selection of text in the document.

This is an example from the provided sample.pdf (page 2, penultimate paragraph):

[Screenshot: Bildschirmfoto 2024-06-11 um 12 57 07]

You can see the overlapping area at the word "bibendum".

I assumed this must originate in the core pdf.js library, but I am unable to reproduce the behavior in the pdf.js demo. I even downloaded the latest release (4.3.136) from https://github.com/mozilla/pdf.js/releases, ran npx serve in the extracted folder, and opened web/viewer.html with sample.pdf. The issue does not occur there.

If you want to test it with a different PDF, try https://www.vbg.de/cms/_Resources/Persistent/7/0/d/c/70dc78bec739e6cbe27bc8ba77a16d15347461d7/M_Arzt_Anforderungen.pdf and look at the last list item on the first page ("über Kenntnisse in der erforderlichen Röntgentechnik und Röntgendiagnostik verfügen.").

Steps to reproduce

Run yarn run dev in sample/create-react-app-5, scroll to page 2 and select the first line of the penultimate paragraph.

Try to select and copy the word 'justo'.

Expected behavior

The selected areas don't overlap. The word 'justo' gets copied.

Actual behavior

They do overlap. The copied text is 'utat'.
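
To make the overlap measurable rather than purely visual, here is a minimal sketch of a check that could be run against the text layer spans. It is not part of react-pdf or pdf.js. The names `Box`, `Overlap`, and `findOverlaps` are hypothetical helpers, and the selector in the usage comment assumes the text layer container carries the class `react-pdf__Page__textContent`.

```typescript
// Minimal rectangle shape; a DOMRect from getBoundingClientRect() satisfies it.
interface Box {
  left: number;
  right: number;
  top: number;
  bottom: number;
}

// One detected overlap: indices into the input array and the overlap width in px.
interface Overlap {
  first: number;
  second: number;
  width: number;
}

// Returns every pair of boxes that share a line (their vertical ranges intersect)
// and whose horizontal ranges intersect by more than `tolerance` pixels.
function findOverlaps(boxes: Box[], tolerance = 0.5): Overlap[] {
  const result: Overlap[] = [];
  for (let i = 0; i < boxes.length; i++) {
    for (let j = i + 1; j < boxes.length; j++) {
      const a = boxes[i];
      const b = boxes[j];
      const sameLine = Math.min(a.bottom, b.bottom) - Math.max(a.top, b.top) > 0;
      const width = Math.min(a.right, b.right) - Math.max(a.left, b.left);
      if (sameLine && width > tolerance) {
        result.push({ first: i, second: j, width });
      }
    }
  }
  return result;
}

// Possible usage in the browser console (assumed selector, see above).
// Only leaf spans are measured, because wrapper spans would overlap their children:
//
//   const spans = Array.from(
//     document.querySelectorAll('.react-pdf__Page__textContent span')
//   ).filter((s) => s.childElementCount === 0);
//   console.table(findOverlaps(spans.map((s) => s.getBoundingClientRect())));
```

Running the same check in the pdf.js viewer and in the react-pdf sample app should show whether the span widths really differ between the two.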

Additional information

No response

Environment

  • Browser (if applicable): Firefox 126.0.1, latest Chrome, Opera, MS Edge (all macOS)
  • React-PDF version: 9.0.0
  • React version: 18.2.0
  • Webpack version (if applicable):

obecker, Jun 11 '24 11:06