sphinx-copybutton icon indicating copy to clipboard operation
sphinx-copybutton copied to clipboard

emphasize-lines on code-block and literalinclude adds extra CR

Open userlerueda opened this issue 4 years ago • 3 comments

Describe the bug

When we copy a code block with emphasize-lines it adds an additional carrier return to the copied text

To Reproduce

Steps to reproduce the behavior:

  1. Create a sphinx directive like this one:
.. code-block::
    :emphasize-lines: 1

    [cisco@centos]$ docker exec -it cl_app_1 bash
    root@5109156083d2:/app# 
  1. Create documentation
  2. Click on copy button
  3. Paste text in an editor and see the extra line in between line 1 and 2

Expected behavior

Text should be copied as is

Environment

  • Python Version: 3.9.2
  • Sphinx v3.5.1
  • Operating System: 11.2.3

Additional context

Add any other context about the problem here.

userlerueda avatar Mar 18 '21 17:03 userlerueda

Thanks for opening your first issue here! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out EBP's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
If your issue is a feature request, others may react to it, to raise its prominence (see Feature Voting).
Welcome to the EBP community! :tada:

welcome[bot] avatar Mar 18 '21 17:03 welcome[bot]

I can confirm this issue. Here are some code boxes with emphazised lines where this can be tested:

  • Example: https://3dcitydb-docs.readthedocs.io/en/latest/impexp/docker.html#importer-exporter-docker-combined-with-3dcitydb-docker
  • Source: https://github.com/3dcitydb/3dcitydb-docs/tree/master/source

BWibo avatar Jan 19 '22 16:01 BWibo

tl;dr

  1. Highlights/emphases render as spans which domElementInstance.innerText seems to convert into extra newlines
  2. I can confirm @BWibo's project is still affected
  3. Arcade's development branch no longer seems to be affected
  4. Please help figure out why; I've compiled what I've found so far below
  5. Two potential fixes: textContent & making text extraction the plugin user's problem

How this bug seems to happen

  1. Manually copying the affected text never triggers the behavior

  2. Each highlight/emphasis is rendered as a <span class='.hll'> around the original line image image image image

  3. The copy button appears to convert .hll spans into extra newlines, specifically whenever target.innerText is used here:

var copyTargetText = (trigger) => {
  var target = document.querySelector(trigger.attributes['data-clipboard-target'].value);
  return formatCopyText(target.innerText, '', false, true, true, true, '', '')
}

Unclear why arcade isn't affected anymore

Arcade was affected as of 2.6.17 and earlier. I don't know why the tip of development no longer appears to be affected. Upgrading sphinx-copybutton from 0.5.0 to 0.5.1 cannot be the only factor as I confirmed 3dcitydb-docs is still affected by building & testing it locally.

Here are the most promising changes to arcade's setup.py between the 2.6.17 release and the tip of our development branch as of the time of writing:

 REQUIREMENTS_DOCS = [

-    "Sphinx==4.5.0",
-    "Pygments==2.10.0",
-    "sphinx-copybutton==0.5.0",
-    "sphinx-sitemap==2.2.0",
+    "Sphinx==6.0.0",
+    "Pygments==2.14.0",
+    "sphinx-copybutton==0.5.1",
+    "sphinx-sitemap==2.4.0",
     "dirsync==2.2.5",
     "pyyaml==6.0",
-    "docutils<0.18",
+    "docutils==0.19",

Two potential fixes

I haven't thoroughly verified these yet & welcome feedback on them.

Use textContent instead of innerText?

domNode.innerText seems to treat <br>s and the ends of spans as newlines. For simple codeblock usecases, replacing it with domNode.textContent seems like it should work. A spot-check of this from my browser's debugger outputs newlines that look correct.

The downside is that it will break rich text because it returns the original text of the nodes before styling, as well as the inner text of script and style nodes.

Maybe we could have the best of both worlds by putting it behind a config flag?

Let users pass in their own text extraction function?

If you let people define their own parsing functions, you get the following benefits:

  1. The plugin could theoretically be compatible with just about any content area
  2. Configuration could be easier / JS functions could take fewer arguments: https://github.com/executablebooks/sphinx-copybutton/blob/afefeefc31907e9f283e82969243a954da7cd1af/sphinx_copybutton/_static/copybutton_funcs.js#L23
  3. Debugging & submitting PRs to this project would be easier due to being able to prototype replacements from within a downstream project

I know this seems incredibly lazy on my part, but it seems worth the risks at first glance.

pushfoo avatar Feb 10 '23 05:02 pushfoo