Question: Should ls -1 -b be acceptable for SC2012?

Open peterjc opened this issue 1 year ago • 2 comments

For bugs

  • Rule Id (if any, e.g. SC1000): SC2012
  • My shellcheck version (shellcheck --version or "online"): online
  • [x] The rule's wiki page does not already cover this (e.g. https://www.shellcheck.net/wiki/SC2012)
  • [x] I tried on https://www.shellcheck.net/ and verified that this is still a problem on the latest commit

Here's a snippet or screenshot that shows the problem:

#!/bin/bash

#Basic example, one line per file:
ls -l ./*.txt | wc -l

#Using -1 is better (no funny date format changes etc):
ls -1 ./*.txt | wc -l

#But even -1 would still map odd characters to question mark
#so use -b which is available on macOS and Linux
ls -1 -b ./*.txt | wc -l

Here's what shellcheck currently says:

All three examples trigger SC2012 (info): Use find instead of ls to better handle non-alphanumeric filenames.

Here's what I wanted or expected to see:

Reading https://www.shellcheck.net/wiki/SC2012 highlights the risk of unexpected variation in the ls metadata output (e.g. date/time). Using -1 avoids that, although it does spoil the wiki's example of searching by username.

Moreover it raises the issue that ls will map some characters to a question mark, potentially giving name collisions, and more importantly resulting in strings which are not valid filenames.

I wonder if you consider -b to be a sufficient safeguard for this?

There are other relevant flags, but this looks to be the most cross-platform (despite not being in the POSIX standard).
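To make the difference concrete, here is a minimal sketch of the counting case from the snippet above. It assumes an ls that supports -b (GNU coreutils or macOS/BSD); demo_dir and the file name are made up for the demonstration and are not from shellcheck or its wiki.

```shell
#!/bin/bash
# Sketch: compare line counts for ONE file whose name contains a newline.
# Assumption: ls supports -b (GNU coreutils, macOS/BSD). demo_dir is hypothetical.
demo_dir=$(mktemp -d)
touch "$demo_dir/$(printf 'one\ntwo').txt"   # a single file with a newline in its name

# When piped, ls prints the raw newline, so wc counts two lines for one file.
plain_count=$(cd "$demo_dir" && ls -1 ./*.txt | wc -l)

# With -b the newline is printed as the two characters \n, so wc counts one line.
escaped_count=$(cd "$demo_dir" && ls -1 -b ./*.txt | wc -l)

echo "ls -1:    $plain_count"
echo "ls -1 -b: $escaped_count"
rm -rf "$demo_dir"
```

With GNU coreutils I would expect the first count to be 2 and the second to be 1, which is the case where -b helps. It only addresses counting; the escaped names are still not the real filenames.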

On macOS, there are no quoting options as on Linux. However with macOS 12.7 we have:

$ man ls
...
     -B      Force printing of non-printable characters (as defined by ctype(3)
             and current locale settings) in file names as \xxx, where xxx is the
             numeric value of the character in octal.  This option is not defined
             in IEEE Std 1003.1-2008 (“POSIX.1”).
...
     -b      As -B, but use C escape codes whenever possible.  This option is not
             defined in IEEE Std 1003.1-2008 (“POSIX.1”).
...

GNU ls has a different meaning for -B, but does offer other relevant options:

$ ls --version | head -n 1
ls (GNU coreutils) 8.30
$ man ls
...
       -b, --escape
              print C-style escapes for nongraphic characters
...
       -q, --hide-control-chars
              print ? instead of nongraphic characters

       --show-control-chars
              show  nongraphic characters as-is (the default, unless program is 'ls' and
              output is a terminal)

       -Q, --quote-name
              enclose entry names in double quotes

       --quoting-style=WORD
              use  quoting  style  WORD  for  entry  names:  literal,   locale,   shell,
              shell-always,  shell-escape,  shell-escape-always,  c,  escape  (overrides
              QUOTING_STYLE environment variable)
...

peterjc avatar Mar 20 '24 11:03 peterjc

I also think ls -1 should be an acceptable alternative to SC2012.

Note:

    didLs <- fmap or . sequence $ [
        for' ["ls", "grep"] $
            \x -> warn x 2010 "Don't use ls | grep. Use a glob or a for loop with a condition to allow non-alphanumeric filenames.",
        for' ["ls", "xargs"] $
            \x -> warn x 2011 "Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames."
        ]
    unless didLs $ void $
        for ["ls", "?"] $
            \(ls:_) -> unless (hasShortParameter 'N' (oversimplify ls)) $
                info (getId ls) 2012 "Use find instead of ls to better handle non-alphanumeric filenames."

hellwolf avatar Sep 02 '24 16:09 hellwolf

I wouldn't consider -b to be a sufficient safeguard, myself, no. My understanding is that on Linux any byte except ASCII NUL (and the / path separator) can be used in a filename. In POSIX, with find -print0, NUL can be used as a consistent filename delimiter, so that it's possible to determine exactly what any given filename is. Varying implementations of ls, sometimes even on the same machine (coreutils, toybox, etc.), as well as how those programs are used in a shell pipeline, can produce differing outputs.

$ ls -1
foo\005bar
foo\nbar
$ ls -1 | head -2
foobar
foo
$
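For the counting use case in the original snippet, a rough sketch of two alternatives that never parse ls output. The directory and file names are hypothetical fixtures mirroring the session above, and -maxdepth and -print0 are assumed to be available (they are in GNU and BSD find).

```shell
#!/bin/bash
# Sketch: count *.txt files without parsing ls output.
# demo_dir and the two file names are hypothetical test fixtures.
demo_dir=$(mktemp -d)
touch "$demo_dir/$(printf 'foo\005bar').txt" "$demo_dir/$(printf 'foo\nbar').txt"

# Option 1: let the shell expand the glob; nothing is printed or re-parsed.
# Note: this overwrites the script's positional parameters.
set -- "$demo_dir"/*.txt
[ -e "$1" ] || shift          # an unmatched glob stays literal; drop it
glob_count=$#

# Option 2: NUL-delimited find output; count the NUL bytes, one per file.
find_count=$(find "$demo_dir" -maxdepth 1 -name '*.txt' -print0 | tr -dc '\0' | wc -c)

echo "glob: $glob_count  find: $find_count"
rm -rf "$demo_dir"
```

Both counts should agree on the number of files regardless of control characters or newlines in the names, because neither approach relies on newline as a delimiter.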

wileyhy avatar Sep 19 '24 11:09 wileyhy