Text selection and copy/paste support

Open izuzak opened this issue 10 years ago • 8 comments

Currently, it's possible to view and scroll the PDF only. It might be useful to have the ability to select and copy/paste text.

PDF.js supports this with an overlay on top of the canvas used for rendering (as shown in the demo), so it should be possible to make it work for Atom as well. :hammer: :scissors: :fork_and_knife:

izuzak avatar May 05 '14 14:05 izuzak

+1 This would be very handy when writing excerpts, taking notes or quoting.

There is a StackOverflow Q&A where someone added the example for this exact use case to the PDF.js repo, did a blog post on it and posted a jsfiddle. So that seems to be best possible starting point for looking into this.

Bengt avatar Nov 16 '14 15:11 Bengt

The differences between the hello world example and the text selection example are actually quite small. The basic idea is to overlay the canvas that the PDF is rendered into with a div that holds the selectable text. Both are updated via promises when the page changes.

Addition in index.html:

        <script src="../../web/ui_utils.js"></script>
        <script src="../../web/text_layer_builder.js"></script>

Addition in minimal.js:

    var textLayerDiv = document.createElement('div');
    textLayerDiv.className = 'textLayer';
    textLayerDiv.style.width = canvas.style.width;
    textLayerDiv.style.height = canvas.style.height;
    pdfPage.appendChild(textLayerDiv);

    var textLayerPromise = page.getTextContent().then(function (textContent) {
      var textLayerBuilder = new TextLayerBuilder({
        textLayerDiv: textLayerDiv,
        viewport: viewport,
        pageIndex: 0
      });
      textLayerBuilder.setTextContent(textContent);
    });

    return Promise.all([renderTask.promise, textLayerPromise]);

Addition in minimal.css:

    .pdfPage {
        position: relative;
        overflow: visible;
        border: 1px solid #000000;
    }

    .pdfPage > canvas {
        position: absolute;
        top: 0;
        left: 0;
    }

    ::selection { background:rgba(0,0,255,0.3); }
    ::-moz-selection { background:rgba(0,0,255,0.3); }

    .textLayer {
        position: absolute;
        left: 0;
        top: 0;
        right: 0;
        bottom: 0;
        color: #000;
        font-family: sans-serif;
        overflow: hidden;
    }

    .textLayer > div {
        color: transparent;
        position: absolute;
        line-height: 1;
        white-space: pre;
        cursor: text;
    }

The examples are a bit confusing because the hello world example creates the canvas in HTML, while the text selection example creates it in JavaScript. Also, the text selection example uses three wrapper functions that do basically the same thing as the hello world code.

Bengt avatar Nov 16 '14 16:11 Bengt

Thanks for investigating this, @Bengt! I won't have time to tackle this myself in the near future, but if you'd like to give it a try -- that would be awesome! :sparkles:

izuzak avatar Nov 19 '14 16:11 izuzak

The text content that PDF.js returns is very fragmented: a single sentence, for example, can be split into multiple objects depending on character size, font, and other formatting attributes.

Apparently, Mozilla deals with this by simply creating a div element for each object, so that collectively the text layer still resembles the page contents. While this works for copying to Microsoft Word or TextEdit, it behaves abnormally in Atom: all text is aggregated into a single line, with whitespace and newlines mysteriously lost... Mozilla's demo exhibits the same problem, and I'm still trying to find a solution.

So far I've only been able to get the text layer working. Copying content from Atom to an external text editor preserves the formatting, but pasting within Atom does not if the destination is a TextEditor instance.
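
For illustration, here is a rough sketch of one possible workaround for the lost newlines; it is not something tried in this thread. It assumes each text item has a `str` and a `transform` array whose fifth and sixth entries are the x and y position of the text, with y growing upward as in PDF space. The function name `itemsToPlainText` and the `yTolerance` parameter are hypothetical, and joining fragments with a single space is a naive choice that can split words.

```javascript
// Sketch only: rebuild line breaks from fragmented text items.
// Assumption: item.transform[4] is x and item.transform[5] is y (y grows upward).
// `itemsToPlainText` and `yTolerance` are hypothetical names, not PDF.js API.
function itemsToPlainText(items, yTolerance = 2) {
  // Order items from the top of the page to the bottom (descending y).
  const byY = items.slice().sort((a, b) => b.transform[5] - a.transform[5]);

  // Put items whose y is within `yTolerance` of the line's first item on one line.
  const lines = [];
  for (const item of byY) {
    const last = lines[lines.length - 1];
    if (last && Math.abs(last.y - item.transform[5]) <= yTolerance) {
      last.items.push(item);
    } else {
      lines.push({ y: item.transform[5], items: [item] });
    }
  }

  // Within each line, order fragments left to right and join them naively.
  return lines
    .map((line) =>
      line.items
        .sort((a, b) => a.transform[4] - b.transform[4])
        .map((item) => item.str)
        .join(' ')
    )
    .join('\n');
}
```

The resulting string could then be written to the clipboard as plain text instead of relying on the browser's selection, though whether that fits the package's copy handling is untested here.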

If needed, here's a link to my fork.

mar29th avatar Nov 06 '15 03:11 mar29th

@lafickens Thanks for those details :zap: -- I wasn't aware of that. If you continue working on this and reach a point you think would be useful to other users -- please feel free to open a pull request. We can perhaps add a config setting to enable/disable this feature so that users can enable it in case they really need to copy text.
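
A config setting like the one suggested above might look roughly like this. This is only a sketch: the setting name `enableTextSelection` and the key path `pdf-view.enableTextSelection` are hypothetical and do not exist in the package as far as this thread shows. The helper takes an object with a `get` method so that it can be passed Atom's `atom.config` or a stub.

```javascript
// Sketch of an Atom package config schema entry.
// `enableTextSelection` is a hypothetical setting name, off by default.
const config = {
  enableTextSelection: {
    type: 'boolean',
    default: false,
    title: 'Enable text selection',
    description: 'Render a selectable text layer on top of each PDF page.'
  }
};

// `atomConfig` is expected to behave like Atom's global `atom.config`.
// The key path assumes the package is registered under the name `pdf-view`.
function isTextSelectionEnabled(atomConfig) {
  return atomConfig.get('pdf-view.enableTextSelection') === true;
}
```

The view would then build the text layer only when `isTextSelectionEnabled(atom.config)` returns true, so users who only want to read PDFs see no change in behavior.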

izuzak avatar Nov 08 '15 16:11 izuzak

Is there any news on text selection and copy/paste support?

Cherkah avatar Aug 19 '16 06:08 Cherkah

@Cherkah No news. If you'd like to work on this -- please feel free to open a pull request and I'd be happy to review it. I probably won't have time to work on it anytime soon.

izuzak avatar Aug 19 '16 06:08 izuzak

Still not implemented? This issue was opened in 2014 and, seven years later, it is still not implemented?

freebrowser1 avatar May 25 '21 08:05 freebrowser1