
ConsoleLexer detecting prompt based on characters in the middle of a line

Open ihollander opened this issue 3 years ago • 5 comments

Name of the lexer ConsoleLexer

Code sample

require 'rouge'

console = <<~TXT
  $ echo "Hello > World"
  "Hello > World"
TXT

lexer = Rouge::Lexers::ConsoleLexer.new
formatter = Rouge::Formatters::HTMLLegacy.new
puts formatter.format(lexer.lex(console))

Additional context

When using the ConsoleLexer, I expect the prompt to be detected based on characters at the start of a line. In the example above, only the $ at the beginning of the $ echo "Hello > World" line should mark that line as a prompt. Instead, the second line, "Hello > World", is also matched as a prompt because of the > character in the middle of the line. The code above produces this output:

<div class="highlight">
  <pre class="codehilite">
    <code>
      <span class="gp">$</span><span class="w"> </span><span class="nb">echo</span><span class="s2">"Hello &gt; World"</span>

      <!-- this line should not be marked as a prompt, but it is -->
      <span class="gp">"Hello &gt;</span><span class="w"> </span>World<span class="s2">"</span>
    </code>
  </pre>
</div>

I expect this output instead:

<div class="highlight">
  <pre class="codehilite">
    <code>
      <span class="gp">$</span><span class="w"> </span><span class="nb">echo</span><span class="s2">"Hello &gt; World"</span>

      <span class="go">"Hello &gt; World"</span>
    </code>
  </pre>
</div>

It looks like this method is the culprit, and the Regex needs to be modified to detect prompt characters at the beginning of a line only:

# lib/rouge/lexers/console.rb
      def prompt_regex
        @prompt_regex ||= begin
          /^#{prompt_prefix_regex}(?:#{end_chars.map(&Regexp.method(:escape)).join('|')})/
        end
      end
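
To see why the second line is tagged as a prompt, here is a minimal sketch in plain Ruby that does not load Rouge. It assumes the prefix part of the regex lazily matches any characters and that the end characters are the defaults `$`, `#`, `>` and `;`. `LOOSE_PROMPT` and `STRICT_PROMPT` are hypothetical names, and the strict variant is only one possible fix, not the library's.

```ruby
# Default prompt end characters (assumed), escaped and joined for a regex.
END_CHARS = %w[$ # > ;].freeze
ENDS = END_CHARS.map { |c| Regexp.escape(c) }.join('|')

# Simplified stand-in for prompt_regex: any prefix (lazy), then an end character.
LOOSE_PROMPT = /^.*?(?:#{ENDS})/

# Hypothetical stricter variant: only whitespace may precede the end character.
# Note that this would also reject prompts such as "user@host:~$".
STRICT_PROMPT = /^\s*(?:#{ENDS})/

['$ echo "Hello > World"', '"Hello > World"'].each do |line|
  loose = LOOSE_PROMPT.match(line)
  strict = STRICT_PROMPT.match(line)
  # Print the text each regex would treat as the prompt (nil means no prompt).
  puts "#{line.inspect}: loose=#{(loose && loose[0]).inspect} strict=#{(strict && strict[0]).inspect}"
end
```

With the loose pattern, the lazy prefix consumes `"Hello ` and stops at the first `>`, which is consistent with the `gp` span shown in the actual output above.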

ihollander avatar Aug 23 '21 13:08 ihollander

I was having a similar problem on my Jekyll blog. After a few minutes of hacking at my local copy of the gem source, I came up with a simple solution.

If you modify prompt_prefix_regex so that the prefix must either be empty or start with a non-space character (that is, so the lexer ignores lines that start with one or more whitespace characters), you can explicitly mark your .go lines by indenting them like this:

```
$ echo "Hello > World"
  "Hello > World" some
```

This does result in the .go lines being indented in the output as well; but since the whole line will be marked as .go, you can fix that visually with a css style such as

.go { text-indent: -2ch; } /* Undo a 2-character indentation of output lines in your code fence. */

Or just embrace having all your .go output lines indented.
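
The indentation rule described above can be sketched in plain Ruby without touching the gem. This is an illustration under assumptions, not the actual patch: `INDENT_AWARE_PROMPT` and `prompt_line?` are hypothetical names, and the end characters are assumed to be the defaults `$`, `#`, `>` and `;`.

```ruby
# Default prompt end characters (assumed), escaped and joined into an alternation.
PROMPT_ENDS = %w[$ # > ;].map { |c| Regexp.escape(c) }.join('|')

# Hypothetical prefix rule: the prefix is either empty or starts with a
# non-space character, so an indented line can never be detected as a prompt.
INDENT_AWARE_PROMPT = /^(?:\S.*?)?(?:#{PROMPT_ENDS})/

# Returns true when the line would be treated as a prompt line.
def prompt_line?(line)
  INDENT_AWARE_PROMPT.match?(line)
end

session = [
  '$ echo "Hello > World"', # real prompt
  '  "Hello > World"',      # indented, so treated as output
  '"Hello > World"'         # not indented, so still detected as a prompt
]

session.each do |line|
  puts "#{prompt_line?(line) ? 'prompt' : 'output'}: #{line}"
end
```

The last entry shows the limitation of this approach: output lines are only protected if they are indented in the source.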

I need some more time to look for edge cases, but I can submit a pull request later this week if there's still interest.

eToThePiIPower avatar Jun 02 '22 02:06 eToThePiIPower

An easier solution might be to set the prompt option yourself so that it excludes >. As an example using the Markdown plugin:

``` console?prompt=$
$ echo "Hello > World"
"Hello > World"
```

jneen avatar Jun 02 '22 04:06 jneen

In your example if you're doing it by hand you could use Rouge::Lexers::ConsoleLexer.new(prompt: '$') instead.
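
As a rough illustration of why narrowing the prompt characters helps, the sketch below rebuilds a regex of the same shape for different end-character lists in plain Ruby. `prompt_regex_for` is a hypothetical helper, not part of Rouge, and the any-character prefix is an assumption about the lexer's behavior.

```ruby
# Hypothetical helper: builds a regex shaped like the lexer's prompt regex
# (any lazy prefix followed by one of the given end characters).
def prompt_regex_for(end_chars)
  ends = end_chars.map { |c| Regexp.escape(c) }.join('|')
  /^.*?(?:#{ends})/
end

default_prompt = prompt_regex_for(%w[$ # > ;]) # default end characters
dollar_prompt  = prompt_regex_for(%w[$])       # analogous to prompt: '$'

output_line = '"Hello > World"'
prompt_line = '$ echo "Hello > World"'

puts "default, output line: #{default_prompt.match?(output_line)}"
puts "dollar,  output line: #{dollar_prompt.match?(output_line)}"
puts "dollar,  prompt line: #{dollar_prompt.match?(prompt_line)}"
```

Under these assumptions, restricting the list to `$` keeps the output line from being detected, although an output line that itself contains a `$` would still be treated as a prompt.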

jneen avatar Jun 02 '22 04:06 jneen

From the console: (image)

jneen avatar Jun 02 '22 04:06 jneen

Here are all the options for the console lexer, documented in rougify list:

console: A generic lexer for shell sessions. Accepts ?lang and ?output lexer options, a ?prompt option, ?comments to enable # comments, and ?error to handle error messages. [aliases: terminal,shell_session,shell-session]
  ?comments= enable hash-comments at the start of a line - otherwise interpreted as a prompt. (default: false, implied by ?prompt not containing `#`)
  ?error= comma-separated list of strings that indicate the start of an error message
  ?lang= the shell language to lex (default: shell)
  ?output= the output language (default: plaintext?token=Generic.Output)
  ?prompt= comma-separated list of strings that indicate the end of a prompt. (default:$,#,>,;)

jneen avatar Jun 02 '22 04:06 jneen