Command line option `--language csharp` will cause an error

Open kojix2 opened this issue 1 year ago • 6 comments

What steps will reproduce the bug?

echo "using System;" | bat -l csharp
[bat error]: unknown syntax: 'csharp'

What happens?

You cannot specify csharp as a language option.

What did you expect to happen instead?

Specifying csharp as the language of a Markdown code block works, so I expected `-l csharp` to work as well:

echo "
\`\`\`csharp
using System;
\`\`\`
" | bat -l markdown

Syntax highlighting is performed as expected.

───────┬────────────────────────────────────────────────────────────────────────
       │ STDIN
───────┼────────────────────────────────────────────────────────────────────────
   1   │ 
   2   │ ```csharp
   3   │ using System;
   4   │ ```
   5   │ 
───────┴────────────────────────────────────────────────────────────────────────

How did you install bat?

cargo install bat

bat version and environment

Software version

bat 0.24.0

Operating system

Linux 6.2.0-36-generic

Command-line

bat -l csharp --diagnostic 

Environment variables

SHELL=/bin/bash
PAGER=<not set>
LESS=<not set>
LANG=ja_JP.UTF-8
LC_ALL=<not set>
BAT_PAGER=<not set>
BAT_PAGING=<not set>
BAT_CACHE_PATH=<not set>
BAT_CONFIG_PATH=<not set>
BAT_OPTS=<not set>
BAT_STYLE=<not set>
BAT_TABS=<not set>
BAT_THEME=<not set>
XDG_CONFIG_HOME=<not set>
XDG_CACHE_HOME=<not set>
COLORTERM=truecolor
NO_COLOR=<not set>
MANPAGER=<not set>

System Config file

Could not read contents of '/etc/bat/config': No such file or directory (os error 2).

Config file

Could not read contents of '/home/kojix2/.config/bat/config': No such file or directory (os error 2).

Custom assets metadata

Could not read contents of '/home/kojix2/.cache/bat/metadata.yaml': No such file or directory (os error 2).

Custom assets

'/home/kojix2/.cache/bat' not found

Compile time information

  • Profile: release
  • Target triple: x86_64-unknown-linux-gnu
  • Family: unix
  • OS: linux
  • Architecture: x86_64
  • Pointer width: 64
  • Endian: little
  • CPU features: fxsr,sse,sse2
  • Host: x86_64-unknown-linux-gnu

Less version

> less --version 
less 590 (GNU regular expressions)
Copyright (C) 1984-2021  Mark Nudelman

less comes with NO WARRANTY, to the extent permitted by law.
For information about the terms of redistribution,
see the file named README in the less distribution.
Home page: https://greenwoodsoftware.com/less

kojix2 avatar Nov 16 '23 11:11 kojix2

Markdown lets you use csharp as a language specifier token, but that's because the Markdown syntax has a special rule for it. I believe bat by default expects you to use C# to refer to the CSharp syntax. I'm not at a computer right now to verify, but try

echo "using System;" | bat -l 'C#'

EDIT: confirmed that the above works. I agree it would be nice if bat's language argument on the command line also accepted the Markdown language tokens, but that would take quite a lot of reworking, duplication, and maintenance, and it could cause confusion in cases where a token is ambiguous.
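To find the name bat expects for a given language, its own language list can be searched. A minimal sketch, assuming a POSIX shell; the helper name `bat_find_language` is made up for illustration, while `--list-languages` is an existing bat option:

```shell
# Hypothetical helper: print the entries of bat's language list that
# match a pattern, ignoring case.
bat_find_language() {
    bat --list-languages | grep -i -- "$1"
}
```

For example, `bat_find_language 'c#'` should print the C# entry together with its file extensions. The help text for `--language` says a file extension is accepted too, so `bat -l cs` may be another option.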

Perhaps a simpler solution (if you never look at Java code, for example) would be to update the first_line_match in C#.sublime-syntax to match `using System;`. Then you wouldn't even need the language specifier. (See https://github.com/sharkdp/bat/blob/master/README.md#adding-new-syntaxes--language-definitions for details on how to do that.)
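The README section linked above describes placing syntax files in bat's configuration directory and rebuilding the cache. A rough sketch of those steps as a shell function, assuming you have already edited a local copy of the syntax file; the function name and the path in the usage line are hypothetical:

```shell
# Hypothetical helper: copy an edited .sublime-syntax file into bat's
# user syntax folder and rebuild bat's syntax cache.
install_custom_syntax() {
    syntax_src=$1
    # `bat --config-dir` prints bat's configuration directory.
    syntax_dest="$(bat --config-dir)/syntaxes"
    mkdir -p "$syntax_dest" &&
    cp "$syntax_src" "$syntax_dest/" &&
    bat cache --build
}
```

Usage would look like `install_custom_syntax ./C#.sublime-syntax`. According to the README, `bat cache --clear` returns to the default syntaxes.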

keith-hall avatar Nov 16 '23 13:11 keith-hall

Another idea is that we could extend --map-syntax to work with -l/--language.

I'm not sure what the best way to approach it would be, since --map-syntax affects filename-based matching, but we have a couple of options:

  • Match based on the pattern, e.g. --map-syntax "csharp:C#"

    • Pro: Re-uses the code for syntax mapping.
    • Con: Will cause any files named csharp to use the C# syntax.
    • Con: Causes wildcard patterns to be valid languages for --language.
  • Match based on the pattern, but only if the pattern does not contain any wildcards.

    • Pro: Re-uses the code for syntax mapping.
    • Con: Will cause any files named csharp to use the C# syntax.
  • Introduce a new option called --map-language with the same syntax.

    • Pro: It doesn't interfere with syntax mapping.
    • Con: The name will cause confusion with --map-syntax.
    • Con: It's more code.
  • Introduce a new format for --map-syntax, with = as a delimiter instead of : (e.g. `--map-syntax "csharp=C#"`)

    • Pro: It doesn't interfere with syntax mapping.
    • Con: It's more code.

I'm leaning towards the last approach. Thoughts, @sharkdp, @keith-hall, @Enselic?
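
Until bat offers such a mapping, the alias idea can be approximated on the user's side. A minimal sketch, assuming a POSIX shell; `bat_language_alias` and `batl` are hypothetical helper names, and csharp to C# is the only pair taken from this thread:

```shell
# Hypothetical helper: map a Markdown-style language token to the name
# bat expects. Unknown tokens are passed through unchanged.
bat_language_alias() {
    # Lowercase the token first so the comparison ignores case.
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        csharp) printf '%s\n' 'C#' ;;
        *)      printf '%s\n' "$1" ;;
    esac
}

# Hypothetical wrapper: the first argument is the language token; all
# remaining arguments are handed to bat untouched.
batl() {
    batl_lang=$(bat_language_alias "$1")
    shift
    bat --language "$batl_lang" "$@"
}
```

With this in a shell profile, `echo "using System;" | batl csharp` would call bat with `--language 'C#'`. This does not change bat itself; it only translates the token before bat sees it.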

eth-p avatar Feb 08 '24 04:02 eth-p

If I understand correctly, the last approach is essentially registering an alias, right? I'm curious whether there are any use cases other than C#, where a language is referred to by multiple similar names. If it's just C#, for example, we could also bundle another sublime-syntax file called Csharp which would just include the main C#.sublime-syntax. I believe that would also solve this concrete problem (thanks to case-insensitive matching) without needing any Rust code changes...

keith-hall avatar Feb 10 '24 06:02 keith-hall

I would say sharkdp and keith-hall are the experts when it comes to syntax mapping, and I 100% trust their judgements in these matters.

Enselic avatar Feb 10 '24 06:02 Enselic

@keith-hall it would just be an alias, yeah. I'll defer to your and sharkdp's judgement for the best approach to take with this for the same reasons Enselic mentioned, though.

eth-p avatar Feb 10 '24 08:02 eth-p

I think @eth-p's proposal to use = in --map-syntax gives the most flexibility. After considering my idea a little, I realize that I don't like the idea of us manually aliasing and maintaining lots of sublime-syntax files. My suggestion was probably better suited as a one-off that users can easily apply themselves. With some code additions, on the other hand, we can more easily cater for other aliases and better ensure that the Markdown code fence language names can also be used directly with bat. 👍

keith-hall avatar Feb 10 '24 21:02 keith-hall