
Feature request: Allow specification of arbitrary line numbers

Open thatlittleboy opened this issue 2 years ago • 2 comments

Overview of feature request

I'm raising this request in the context of the HTML formatter provided by pygments. Unsure if this applies to other formatters as well.

As of now, the pygments HtmlFormatter provides a couple of options related to attaching line numbers to the formatted code blocks, namely linenos (False, True, "table", "inline"), hl_lines, linenostart, linenostep (with linenospecial, anchorlinenos tangentially related).
linenostart lets the numbering start from a specified number rather than from 1, and linenostep prints only every nth line number in the line number column.
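
To make the existing behaviour concrete, here is a small plain-Python model of which labels end up in the line number column. It is not pygments code: gutter_labels is a hypothetical helper, and the rule that a number is shown only when it is a multiple of linenostep is my reading of the formatter, consistent with the linenostep example further down in this issue.

```python
def gutter_labels(n_lines, linenostart=1, linenostep=1):
    """Model the line number column (assumption, not pygments code).

    Lines are numbered consecutively from linenostart; a label is shown
    only when the number is a multiple of linenostep, otherwise the cell
    is left empty.
    """
    labels = []
    for number in range(linenostart, linenostart + n_lines):
        labels.append(str(number) if number % linenostep == 0 else "")
    return labels


print(gutter_labels(4, linenostart=50, linenostep=2))  # ['50', '', '52', '']
```

The point is that the numbers are always consecutive: both options only shift or thin out a sequential range.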

However, we sometimes want to display arbitrary (not necessarily evenly stepped) line numbers next to the source code. The request is therefore to expose one more option, say linenoprint, which specifies the list of line numbers to print alongside the provided source. See API Details below.

Motivation

mkdocstrings uses pygments (indirectly) when displaying source code in the generated documentation, and we want to be able to expose an option to remove docstrings from that source code. See the discussion in https://github.com/mkdocstrings/mkdocstrings/issues/249#issuecomment-1399264566 for more details.

The intention is to remove the docstrings from the source before passing it into pygments. To still present the result as one contiguous code block with the original line numbers, we need support for specifying arbitrary line numbers, rather than assuming that line numbers are always sequential.
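
As a rough illustration of that preprocessing step, the sketch below strips docstrings with the standard ast module and records the original number of every line that survives. strip_docstrings is a hypothetical helper, not something mkdocstrings or pygments provides. It assumes Python 3.8+ (for end_lineno) and that each docstring sits on its own lines.

```python
import ast


def strip_docstrings(source):
    """Return (stripped_source, kept_line_numbers).

    Hypothetical helper: drops every line that belongs to a module, class,
    or function docstring and keeps the original 1-based number of each
    remaining line. The result is meant for display only.
    """
    doc_lines = set()
    owners = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, owners) or not node.body:
            continue
        first = node.body[0]
        # A docstring is a bare string constant as the first statement.
        if (isinstance(first, ast.Expr)
                and isinstance(first.value, ast.Constant)
                and isinstance(first.value.value, str)):
            doc_lines.update(range(first.lineno, first.end_lineno + 1))
    kept = [(number, line)
            for number, line in enumerate(source.splitlines(), start=1)
            if number not in doc_lines]
    stripped = "\n".join(line for _, line in kept) + "\n"
    return stripped, [number for number, _ in kept]


source = '''\
def foo(x) -> None:
    """Print x."""
    print(x)
'''
stripped, numbers = strip_docstrings(source)
print(numbers)  # [1, 3]
```

The numbers list is what would be handed to the requested option, so that the gutter shows 1 and 3 instead of 1 and 2.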

API Details

This is just my proposal:

  • The behaviour described below should be available for both the table and inline linenos styles. If linenos is not provided or is False, then linenoprint should be ignored.
  • linenoprint should be mutually exclusive with linenostart. Throw an error if both are provided.
  • Normal Behaviour:
    from pygments import highlight
    from pygments.lexers import PythonLexer
    from pygments.formatters import HtmlFormatter
    code = """\
    def foo(x) -> None:
        print(x)
    """
    highlight(code, PythonLexer(), HtmlFormatter(linenos="table", linenoprint=[50, 52]))
    
    after formatting, should return an output that looks like
    50 | def foo(x) -> None:
    52 |     print(x)
    
  • If the provided linenoprint list is shorter / longer than the number of lines in the source, I don't really have a preference on what pygments should do. If I had to decide, I'd be more permissive, that is:
    • if len(linenoprint) < len(source), do an equivalent of zip_longest and pad linenoprint with ""
      highlight(code, PythonLexer(), HtmlFormatter(linenos="table", linenoprint=[50,]))
      
      can probably give
      50 | def foo(x) -> None:
         |     print(x)
      
    • if len(linenoprint) > len(source), then do an equivalent of zip(strict=False) and ignore the rest of linenoprint.
      highlight(code, PythonLexer(), HtmlFormatter(linenos="table", linenoprint=[50, 52, 99]))
      
      can probably give
      50 | def foo(x) -> None:
      52 |     print(x)
      
  • Using linenoprint should be independent of the other line number options (e.g. linenostep, hl_lines). For example:
    highlight(code, PythonLexer(), HtmlFormatter(linenos="table", linenoprint=[50, 52], linenostep=4))
    
    after formatting, should return an output that looks like
       | def foo(x) -> None:
    52 |     print(x)
    
    If there are other potential conflicts that I missed, we can discuss how to deal with them below.
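
The rules in this list can be summarised in a few lines of plain Python. This is only a sketch of the proposed semantics, not pygments code: resolve_labels is a hypothetical helper, linenoprint does not exist in pygments today, and the sketch handles integer entries only.

```python
def resolve_labels(n_lines, linenoprint=None, linenostart=None, linenostep=1):
    """Return the label shown next to each of n_lines source lines.

    Sketch of the proposal (assumption, not an existing pygments API):
    linenoprint and linenostart are mutually exclusive, a short list is
    padded with empty labels, a long list is truncated, and linenostep
    still hides numbers that are not multiples of the step.
    """
    if linenoprint is not None and linenostart is not None:
        raise ValueError("linenoprint and linenostart are mutually exclusive")
    if linenoprint is None:
        start = 1 if linenostart is None else linenostart
        numbers = list(range(start, start + n_lines))
    else:
        numbers = list(linenoprint[:n_lines])          # ignore extra entries
        numbers += [None] * (n_lines - len(numbers))   # pad missing entries
    return [
        str(number) if number is not None and number % linenostep == 0 else ""
        for number in numbers
    ]


print(resolve_labels(2, linenoprint=[50, 52]))                # ['50', '52']
print(resolve_labels(2, linenoprint=[50]))                    # ['50', '']
print(resolve_labels(2, linenoprint=[50, 52, 99]))            # ['50', '52']
print(resolve_labels(2, linenoprint=[50, 52], linenostep=4))  # ['', '52']
```

Padding and truncating here is the permissive variant; a stricter variant would raise on any length mismatch instead.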

Additional, optional, good-to-have

It would be nice if the line numbers were not strictly required to be integers, so that we could format something like this:

code = """\
def foo(x) -> None:
    …
    print(x)
"""
highlight(code, PythonLexer(), HtmlFormatter(linenos="table", linenoprint=[50, "…", 52]))  # shouldn't throw an error

after formatting, should return an output that looks like

50 | def foo(x) -> None:
…  |     …
52 |     print(x)

to signify that we've truncated some code off.
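
A minimal sketch of how mixed labels could be laid out, assuming every label is simply converted with str() and left-aligned to the widest one. render_gutter is a hypothetical helper that produces the plain-text mock-up used in this issue, not the HTML that pygments emits.

```python
def render_gutter(lines, labels):
    """Join labels and source lines as 'label | line' rows.

    Labels may be integers or arbitrary strings such as "…"; None means
    an empty cell. Hypothetical helper for illustration only.
    """
    texts = ["" if label is None else str(label) for label in labels]
    width = max((len(text) for text in texts), default=0)
    return "\n".join(
        f"{text:<{width}} | {line}" for text, line in zip(texts, lines)
    )


code_lines = ["def foo(x) -> None:", "    …", "    print(x)"]
print(render_gutter(code_lines, [50, "…", 52]))
```

Whether a string label should also be subject to linenostep is left open here; the sketch always shows it.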

thatlittleboy avatar Jan 22 '23 04:01 thatlittleboy

If the provided linenoprint list is shorter / longer than the number of lines in the source, I don't really have a preference on what pygments should do. If I had to decide, I'd be more permissive, that is:
  • if len(linenoprint) < len(source), do an equivalent of zip_longest and pad linenoprint with ""
    highlight(code, PythonLexer(), HtmlFormatter(linenos="table", linenoprint=[50,]))

    can probably give
    50 | def foo(x) -> None:
       |     print(x)

  • if len(linenoprint) > len(source), then do an equivalent of zip(strict=False) and ignore the rest of linenoprint.
    highlight(code, PythonLexer(), HtmlFormatter(linenos="table", linenoprint=[50, 52, 99]))

    can probably give
    50 | def foo(x) -> None:
    52 |     print(x)

“Errors should never pass silently.”

jeanas avatar Jan 22 '23 08:01 jeanas

“Errors should never pass silently.”

Yep, in general, I agree. I thought maybe you'd want to be consistent with the behaviour of the hl_lines option. Passing hl_lines a line number that doesn't exist in the source, or passing it a string, is silently ignored in both cases.

code = """\
def foo(x) -> None:
    print(x)
"""
highlight(code, PythonLexer(), HtmlFormatter(linenos="table", hl_lines=[1, 10, "abc"]))  # no error thrown by pygments

In that case, feel free to throw an error when linenoprint is too short or too long. As I mentioned, I have no strong preference.
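
For comparison with the permissive variant, a strict check could look roughly like the sketch below. check_linenoprint is a hypothetical helper and the choice of ValueError is an assumption. It only shows what "errors should never pass silently" would mean for a length mismatch.

```python
def check_linenoprint(source, linenoprint):
    """Raise ValueError unless there is exactly one label per source line.

    Hypothetical strict validation for the proposed option.
    """
    n_lines = len(source.splitlines())
    if len(linenoprint) != n_lines:
        raise ValueError(
            f"linenoprint has {len(linenoprint)} entries "
            f"but the source has {n_lines} lines"
        )
    return list(linenoprint)


code = "def foo(x) -> None:\n    print(x)\n"
print(check_linenoprint(code, [50, 52]))  # [50, 52]
```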

thatlittleboy avatar Jan 22 '23 08:01 thatlittleboy