bat icon indicating copy to clipboard operation
bat copied to clipboard

no syntax highlighting for symlinked file

Open sumanthratna opened this issue 4 years ago • 23 comments

See https://github.com/sharkdp/bat/issues/984#issuecomment-627998442.

What version of bat are you using? bat 0.15.1

Describe the bug you encountered: bat ~/.ssh/config is showing the contents of my ssh_config in all white. This might be because ~/.ssh/config is symlinked (ln -s) to a file in my dotfiles repo.

Describe what you expected to happen? bat ~/.ssh/config should highlight the contents of the file according to https://github.com/robballou/sublimetext-sshconfig.

How did you install bat? Homebrew


system

$ uname -srm Darwin 19.5.0 x86_64

$ sw_vers ProductName: Mac OS X
ProductVersion: 10.15.5
BuildVersion: 19F83c

bat

$ bat --version bat 0.15.1

$ env

bat_config

bat_wrapper

No wrapper script for 'bat'.

bat_wrapper_function

No wrapper function for 'bat'.

No wrapper function for 'cat'.

tool

$ less --version less 487 (POSIX regular expressions)

sumanthratna avatar May 15 '20 13:05 sumanthratna

Thank you for reporting this!

bat ~/.ssh/config is showing the contents of my ssh_config in all white. This might be because ~/.ssh/config is symlinked (ln -s) to a file in my dotfiles repo.

That is indeed a bug. I can confirm this.

The culprit is this line where we canonicalize the path, which resolves symlinks:

https://github.com/sharkdp/bat/blob/c2c70707125720619f7dd7d2ad6001a07cf97df6/src/assets.rs#L207-L208

The reason I used .canonicalize() was to solve situations like cd /etc; bat fstab, where we would have to resolve the relative fstab path to an absolute /etc/fstab path. Similarly, if someone is inside $USER/.ssh, calling bat config should also map to the SSH config syntax.

In order to solve this, I guess we should first look up the non-canonicalized path in the syntax mapping. If this does not lead to any results, we look up the canonicalized path as well. But we should think about this more carefully. And probably merge #985 first, before we attempt to solve this.

sharkdp avatar May 15 '20 15:05 sharkdp

In order to solve this, I guess we should first look up the non-canonicalized path in the syntax mapping. If this does not lead to any results, we look up the canonicalized path as well. But we should think about this more carefully.

This is a tough one, because we have three possible path names:

  • The canonical path.
  • The absolute path.
  • The relative path (provided as-is)

The correct highlighting really depends on the context. If it's a symlinked config file like this issue, it should be the absolute path. If it's a symlinked file that doesn't have an absolute path syntax mapping, it should probably be the canonical path because that's what the file actually is. And if that doesn't have a mapping, should it fall back to the relative path/filename?

Unless we have a way to determine if a syntax provides an absolute-path mapping for a file, I think the safest solution would probably be to look it up in order of absolute, canonical, and then relative/filename.

And probably merge #985 first, before we attempt to solve this.

That would be a good idea, but I could also just add a fix for this as part of #985 if that's easier.

eth-p avatar May 15 '20 16:05 eth-p

Just to make sure that we have the same understanding of canonical/absolute/as-is path:

cd /home
bat shark/.ssh/config
  • as-is path: shark/.ssh/config
  • absolute path: /home/shark/.ssh/config (which is a symlink to the …)
  • canonical path: /home/shark/.my-config-repo/ssh_config_file

The correct highlighting really depends on the context. If it's a symlinked config file like this issue, it should be the absolute path.

:+1:

If it's a symlinked file that doesn't have an absolute path syntax mapping, it should probably be the canonical path because that's what the file actually is.

You are thinking about something like a symlink from my_script -> original_file.ext where you would like to use the extension of the original file to get some information about my_script? But what if we have a scenario script.ext1 -> script.ext2? Should we really look up ext2 or accept the fact that script.ext1 doesn't lead to any matches in the syntax mapping and then get the actual syntax for ext1?

I'd rather go with the latter and not consider the canonical path at all.

Also, do we really have to look up the (potentially relative) as-is path? What would be a case where this would be useful? Maybe it's enough to simply get the absolute path and use it as a basis for the syntax mapping.

Note: getting the "absolute path" is actually non-trivial in Rust: https://stackoverflow.com/a/54817755/704831

sharkdp avatar May 15 '20 18:05 sharkdp

Just to make sure that we have the same understanding of canonical/absolute/as-is path

Yep, it looks like we do.

You are thinking about something like a symlink from my_script -> original_file.ext where you would like to use the extension of the original file to get some information about my_script?

Something like that, yes.

But what if we have a scenario script.ext1 -> script.ext2? Should we really look up ext2 or accept the fact that script.ext1 doesn't lead to any matches in the syntax mapping and then get the actual syntax for ext1?

We could look it up order like this this:

  1. script.ext2 matches an absolute path mapping.
  2. script.ext1 matches an extension syntax.
  3. Use the first-line syntax for the file.
  4. script.ext2 matches an extension syntax.

That should cover cases like a symlinked ~/.ssh/config (1) first, then attempt to look up the syntax based on its perceived extension (2) or content type (3). If none of those are found, I think it would be reasonable to fall back to the canonical file's extension (4).

Note: getting the "absolute path" is actually non-trivial in Rust: https://stackoverflow.com/a/54817755/704831

I was looking at that earlier, actually. It's a shame it's not included in the standard library, but path-clean is a pretty small dependency that won't pull in any additional crates.

eth-p avatar May 15 '20 19:05 eth-p

We could look it up order like this this:

  1. script.ext2 matches an absolute path mapping.
  2. script.ext1 matches an extension syntax.
  3. Use the first-line syntax for the file.
  4. script.ext2 matches an extension syntax.

If the first one would be "script.ext1 matches an absolute path mapping", I would agree. That sounds like a good idea.

sharkdp avatar May 15 '20 19:05 sharkdp

Note: getting the "absolute path" is actually non-trivial in Rust: https://stackoverflow.com/a/54817755/704831

I was looking at that earlier, actually. It's a shame it's not included in the standard library, but path-clean is a pretty small dependency that won't pull in any additional crates.

In fd, we have been using this code for some time:

https://github.com/sharkdp/fd/blob/0335cc362b2c830c24b957504a7cb1f7cd623a44/src/filesystem.rs#L12-L34

It's been serving us well, but we would need to check if that works for the use case here as well. I'm not sure if all "cleanups" are performed.

sharkdp avatar May 15 '20 19:05 sharkdp

https://github.com/sharkdp/fd/blob/0335cc362b2c830c24b957504a7cb1f7cd623a44/src/filesystem.rs#L12-L34

I adapted it to work in the Rust playground, and it works for everything except parent directories: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=2e03fcb23e87920e543cd54d300d207c

"/test" -> "/test"
"test" -> "/some/example/pwd/test"
"./test" -> "/some/example/pwd/test"
"../test" -> "/some/example/pwd/../test"

eth-p avatar May 15 '20 20:05 eth-p

Ok, yeah. That's probably what path-clean is for. I'm ok with adding it as a dependency if it works for Windows paths as well.

sharkdp avatar May 15 '20 20:05 sharkdp

After some testing, it doesn't seem like it handles Windows paths. It would also require a slight amount of boilerplate, too. I'm going to look into finding a crate dedicated to converting relative to absolute paths and get back to you when I find a suitable one.

Potential crates:

  • https://crates.io/crates/path-absolutize (2 subdependencies)
  • https://crates.io/crates/path_abs (4 subdependencies; offers more than we need)

eth-p avatar May 15 '20 20:05 eth-p

path-absolutize looks good to me

sharkdp avatar May 16 '20 06:05 sharkdp

I'm not sure if this is useful info for you guys, but I think bat highlights correctly for a symlinked gitconfig: .gitconfig -> ~/dotprophet/gitconfig

sumanthratna avatar May 20 '20 14:05 sumanthratna

I'm not sure if this is useful info for you guys, but I think bat highlights correctly for a symlinked gitconfig: .gitconfig -> ~/dotprophet/gitconfig

I guess it only works because the original file is also called gitconfig, which is one of the patterns:

❯ bat -L | grep gitconfig
Git Config:gitconfig,.gitconfig,.gitmodules

If you would rename the actual file, it probably won't work anymore.

sharkdp avatar May 20 '20 20:05 sharkdp

This issue should be fixed in bat v0.15.2. Feedback would be appreciated.

sharkdp avatar May 25 '20 15:05 sharkdp

~/.ssh/config -> ~/dotprophet/sshconfig:

  • when I run bat ~/.ssh/config there's syntax highlighting
  • when I run bat ~/dotprophet/sshconfig, the text is in all-white, with no syntax highlighting.

I'm on bat 0.15.2.

sumanthratna avatar May 25 '20 22:05 sumanthratna

@sumanthratna Thank you.

when I run bat ~/dotprophet/sshconfig, the text is in all-white, with no syntax highlighting.

Well, that is expected. sshconfig is not a pattern that we recognize (only .ssh/config and ssh_config are, see bat -L). If you want that file to be highlighted as well, you would have to rename it to ssh_config or add this to your bat config file:

--map-syntax="sshconfig:SSH Config"

sharkdp avatar May 26 '20 05:05 sharkdp

I see, I was misunderstanding something. I had thought that when I run bat ~/dotprophet/sshconfig, the path would get "absolutized" into /Users/suman/.ssh/config, which would be highlighted.

I think this would be a nice feature to have. If this is possible, I'll create a new issue for this feature request.

EDIT: and for the record, I just renamed ~/dotprophet/sshconfig to ~/dotprophet/ssh_config and syntax highlighting works fine.

sumanthratna avatar May 26 '20 13:05 sumanthratna

as misunderstanding something. I had thought that when I run bat ~/dotprophet/sshconfig, the path would get "absolutized" into /Users/suman/.ssh/config, which would be highlighted.

That is not possible. You can not reverse-resolve symlinks.

sharkdp avatar May 26 '20 15:05 sharkdp

I had thought that when I run bat ~/dotprophet/sshconfig, the path would get "absolutized" into /Users/suman/.ssh/config, which would be highlighted.

That is not possible. You can not reverse-resolve symlinks.

Does this not do what I'm suggesting? https://doc.rust-lang.org/std/fs/fn.read_link.html

sumanthratna avatar May 26 '20 17:05 sumanthratna

You said you have a symlink ~/.ssh/config -> ~/dotprophet/sshconfig. This means that ~/.ssh/config is a symlink to ~/dotprophet/sshconfig. You can call read_link on ~/.ssh/config to resolve the symlink "along the arrow" to get the result ~/dotprophet/sshconfig. But there is no way to go backwards.

So if you call bat ~/dotprophet/sshconfig, there is no way that we could know that there is a symlink somewhere else, on this or even on another filesystem, that points to this file. We cannot "resolve symlinks backwards".

sharkdp avatar May 26 '20 18:05 sharkdp

Ahhh yikes, you're right, I don't know why I was thinking that. :) Regardless, it looks like the original issue for this ticket has been solved.

sumanthratna avatar May 26 '20 18:05 sumanthratna

I still met this bug:

❯ \ls Aliases/0install -l
lrwxrwxrwx 1 wzy wzy 26 8月   9 20:48 Aliases/0install -> ../Formula/zero-install.rb

However, bat Aliases/0install will not use ruby's syntax highlight. This example comes from linuxbrew: /home/linuxbrew/.linuxbrew/Homebrew/Library/Taps/homebrew/homebrew-core/Aliases/0install https://github.com/Homebrew/homebrew-core/tree/master/Aliases/0install

Freed-Wu avatar Aug 14 '22 16:08 Freed-Wu

I'm able to reproduce the homebrew-core example, so, although we have fixed the original specific symlink example in the OP, I'm going to reopen this issue as it is still related.

keith-hall avatar Aug 26 '22 05:08 keith-hall

as an experiment, I was able to fix it with this monstrosity (i.e. calling canonicalize), but then it broke the original assets::tests::syntax_detection_for_symlinked_file test, so an expert will probably need to take a look:

diff --git a/src/assets.rs b/src/assets.rs
index f87d393..388c1ab 100644
--- a/src/assets.rs
+++ b/src/assets.rs
@@ -7,7 +7,7 @@ use once_cell::unsync::OnceCell;
 use syntect::highlighting::Theme;
 use syntect::parsing::{SyntaxReference, SyntaxSet};
 
-use path_abs::PathAbs;
+use path_abs::{PathAbs, PathInfo};
 
 use crate::error::*;
 use crate::input::{InputReader, OpenedInput};
@@ -271,7 +271,7 @@ impl HighlightingAssets {
         let path = input.path();
         let path_syntax = if let Some(path) = path {
             self.get_syntax_for_path(
-                PathAbs::new(path).map_or_else(|_| path.to_owned(), |p| p.as_path().to_path_buf()),
+                PathAbs::new(path).map_or_else(|_| path.to_owned(), |p| p.canonicalize().map_or_else(|_| p.as_path().to_path_buf(), |c| c.as_path().to_path_buf())),
                 mapping,
             )
         } else {

keith-hall avatar Aug 26 '22 07:08 keith-hall