Cross-line text cannot be highlighted.

Open devyujie opened this issue 11 months ago • 6 comments

Additional context

  • vue-pdf: [1.11.3]
  • vue: [3.5.12]

Hello, I need some help. When I work with a PDF containing Chinese, some text wraps across lines, and this cross-line text cannot be highlighted.

Test text: ['「清晰」', '效率的提升','不确定', '明确信息层级导向', '产品操', '品牌信赖感品牌的一致性是务']

Test pdf: test_new.pdf

devyujie avatar Jan 16 '25 09:01 devyujie

Hi @devyujie,

vue-pdf is a wrapper around the Mozilla PDF.js project. Can you try rendering your PDF in plain JS to see whether the issue also occurs there? Mozilla has some examples with JS fiddles that you can use: https://mozilla.github.io/pdf.js/examples/

vordimous avatar Jan 17 '25 17:01 vordimous

Thanks. Can this library call the iframe.contentWindow.PDFViewerApplication.findBar API? My idea was to use the pdfjs findbar.

devyujie avatar Jan 20 '25 06:01 devyujie

Can this library call iframe.contentWindow.PDFViewerApplication.findBar API?

vue-pdf doesn't use the PDFViewerApplication to render the PDF. @TaTo30 please correct me if I am wrong. So the findBar API is not available.

In the sample you provided, are you using the highlight-text property?

vordimous avatar Jan 20 '25 16:01 vordimous

PDFViewerApplication is an API used specifically by Mozilla Viewer and is not available in pdfjs-dist: https://github.com/mozilla/pdf.js/issues/9210

The problem with highlighting the text "明确信息层级导向" is that pdf.js does not include the last character ("向") in the first line; it places it in the second line instead:

Image

We had a similar issue with the Latin alphabet (#125), but in that case it was easy to determine when a word was "broken" by looking for a hyphen at the end of the line. How can one determine in Chinese that a word or phrase is being broken?
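
For illustration only, here is a rough sketch of one possible approach for text without hyphenation: join the text items and search the joined string, then map each match back to the items it touches. This is not how vue-pdf works internally. `TextChunk`, `ChunkRange`, and `findAcrossChunks` are hypothetical names, and the sketch assumes the items can be joined with no separator (reasonable for Chinese, not for Latin text, which would need spaces or hyphen handling).

```typescript
// Hypothetical minimal shape; pdf.js text items carry their text in `str`.
interface TextChunk {
  str: string;
}

// One highlighted slice inside a single chunk.
interface ChunkRange {
  chunk: number; // index into the chunks array
  start: number; // inclusive offset within that chunk's str
  end: number;   // exclusive offset within that chunk's str
}

// Find every occurrence of `query` in the concatenation of all chunks and
// map each occurrence back to per-chunk ranges, so a match may span lines.
function findAcrossChunks(chunks: TextChunk[], query: string): ChunkRange[][] {
  if (query.length === 0) return [];

  // Assumption: chunks are joined with no separator between them.
  const joined = chunks.map((c) => c.str).join("");

  const matches: ChunkRange[][] = [];
  let from = joined.indexOf(query);
  while (from !== -1) {
    const to = from + query.length;
    const ranges: ChunkRange[] = [];
    let offset = 0; // start of the current chunk within `joined`
    chunks.forEach((c, i) => {
      const chunkStart = offset;
      const chunkEnd = chunkStart + c.str.length;
      offset = chunkEnd;
      // Intersect the match [from, to) with this chunk [chunkStart, chunkEnd).
      const lo = Math.max(from, chunkStart);
      const hi = Math.min(to, chunkEnd);
      if (lo < hi) {
        ranges.push({ chunk: i, start: lo - chunkStart, end: hi - chunkStart });
      }
    });
    matches.push(ranges);
    from = joined.indexOf(query, from + 1);
  }
  return matches;
}
```

Each inner array is one match. A highlighter could then wrap every listed range inside the element that renders the corresponding chunk, so the trailing "向" on the second line is marked together with the rest of the phrase.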

TaTo30 avatar Jan 20 '25 18:01 TaTo30

Chinese line breaks do not seem to have any marker ... Then I tried another approach, using the pdfjs findbar. It can highlight multiple keywords, but there is a new problem: when a single line contains several keywords to highlight, only one of them is highlighted.
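
As a rough sketch of what "highlight every keyword in a line" could look like outside the findbar, the function below collects all occurrences of all keywords in one string and merges overlapping or touching ranges. `MatchRange` and `findAllKeywordRanges` are hypothetical names, and this makes no claim about how the pdfjs findbar or vue-pdf behave.

```typescript
// Half-open character range [start, end) within a single line of text.
interface MatchRange {
  start: number;
  end: number;
}

// Collect every occurrence of every keyword in `line`, then merge ranges
// that overlap or touch so each character is highlighted at most once.
function findAllKeywordRanges(line: string, keywords: string[]): MatchRange[] {
  const found: MatchRange[] = [];
  for (const kw of keywords) {
    if (kw.length === 0) continue; // an empty keyword would match everywhere
    let at = line.indexOf(kw);
    while (at !== -1) {
      found.push({ start: at, end: at + kw.length });
      at = line.indexOf(kw, at + 1);
    }
  }

  // Sort by start offset, then by end offset.
  found.sort((a, b) => a.start - b.start || a.end - b.end);

  const merged: MatchRange[] = [];
  for (const r of found) {
    const last = merged[merged.length - 1];
    if (last !== undefined && r.start <= last.end) {
      last.end = Math.max(last.end, r.end); // extend the previous range
    } else {
      merged.push({ start: r.start, end: r.end });
    }
  }
  return merged;
}
```

With the test strings from this issue, a line such as "效率的提升与不确定" and the keywords '效率的提升' and '不确定' would yield two separate ranges, one per keyword.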

devyujie avatar Jan 21 '25 07:01 devyujie

Yes, I used the highlight-text property.

devyujie avatar Jan 21 '25 07:01 devyujie