woof-CE
woof-CE copied to clipboard
Coding style and PR etiquette.
coding style
There is plenty of messy shell code in woof. @wdlkmpx and myself (and others) over the years have attempted to clean it up somewhat but if someone commits code with a mix of tabs and spaces then we end up back at square :one:
basic guidelines
If a script has spaces ( 1 or 2 ) as the indent then follow that. If a script has tabs as the indent then follow that. Please don't mix them up.
code quality
EG:
if [ "`which program`" != "" ] ; then
exec program
fi
Let's analyze this
- we perform a conditional test
- we call an external program -
which
in a subshell - we compare it to
""
(essentially nothing) - the result is evaluated and we proceed with the block or skip the block depending on result.
So what could go wrong?
User or builder installed GNU which
giving vastly different output than busybox which
Let's try it another way
if type program >/dev/null 2>&1 ; then
exec program
fi
So what happens here?
- call the bash/ash/dash internal
type
to determine ifprogram
is installed and suppress the output - if
type
is successful the condition is satisfied and we proceed or skip as before
Can we see that this is faster? In a large script that has to test perhaps for several programs to determine which one to use then this will have a significant speed boost.
Imagine this on a pi Zero
browser=''
# choose first available browser
for B in firefox seamonkey chromium google-chrome opera vivaldi links lynx midori
do
# [ "`which $B`" != "" ] && browser=$B && break # too slow
type $B >/dev/null 2>&1 && browser=$B && break
done
Basically, bash
and ash
are very powerful and can speed up your code if used diligently. Of course sometimes we can't get away from using external calls, especially sed
, grep
and awk
. I myself am no expert in shell internals and I often refer to ash or bash man pages; there are many to be found online. Redirections can also be very powerful as well as the operators. Take time to familiarise yourselves more with bash and ash and you will be pleased with the results. Happy coding :smile:
pull requests
If you have a pull request that requires that new programs need to be installed in all PKGS_SPECS_TABLE
s variables (DISTRO_PKS_SPECS-$DISTRO_BINARY_COMPAT-$DISTRO_COMPAT_VERSION
file), then please make that one pr rather than a separate pr for each version.
Please make a better comment about the PR. You have to sell the idea to the maintainer to convince her/him to test it and then merge it. Doesn't matter if it's a 200 word or more essay! 5 or six words is not enough to convince me. Remember, these go into the git history so it is much easier to find certain commits if something goes wrong or the the commit needs enhancement. 20 to 30 words is usually enough, just explaining why it is needed and what it affects.
Also, if it is relevant, include in the commit message the relevant related commits (just the first seven chars of the sha commit number) or the issue number, especially if it closes the issue eg:
fix such and such bug, closes #9876
this fixes the bug where the computer jumped out the window and ran off with my motorbike. necessary because I like my motorbike
Don't let this discourage you from sending in PRs! Don't be discouraged if it gets rejected either, there will be a reason.
My preference is for two spaces rather than a tab the default spacing for most text editors is that "1 tab" equals "four spaces" and unless you have a giant screen (or high resolution and good eyes), it doesn't take much nesting to use you your screen width.
I do see though that geany has some automatic settings, that will make this easier for me, to be hopefully be consistent with others. Under the document menu, I can set the following: Indent Type -> Detect from document #This should address the above: Indent With -> "2" # This should make me happy.
Or if a document uses spaces rather than tabs are we particular about how many spaces to use?
Regarding "which" vs "type", I think THAT "which" is more readable because it is a command that most people use. That said I see the merit in writing portable code. My question is why would the GNU version of which behave differently than the busybox version. Shouldn't errors be printed to standard error and not captured by quotation marks.
The non obvious thing about which, is that it won't find a program in the path if it isn't executable (e.g. a broken symlink).
if you prefer to use which
suppression of all output would be preferable; just replace type
with which
in my example. Would be faster than the test/comparison, as we just rely on exit code.
That said, quality of code is important and understanding of the shell you are using certainly can increase the speed and quality. Don't worry, some of my code isn't that great either but I have been making a concerted effort to improve it.
if you prefer to use
which
suppression of all output would be preferable; just replacetype
withwhich
in my example. Would be faster than the test/comparison, as we just rely on exit code.That said, quality of code is important and understanding of the shell you are using certainly can increase the speed and quality. Don't worry, some of my code isn't that great either but I have been making a concerted effort to improve it.
Good point, regarding the speed difference of comparison vs exit codes.
Yeah I'm a terrible offender for this - calling sub shells, and checking output when return codes will suffice..
But I always use 2 spaces for indentation, as it works best on various setups .. (tabs looks huge on github/web).
...I will go through Pkg at some point and do a PR which is soley focused on reducing sub shells and using builtins as much as possible to speed it up.
An example of using Bash builtins only
There is a great project called fff (f***ing fast file manager) which is a great example of using Bash 3.2 builtins only. https://github.com/dylanaraps/fff
Fixing indentation
I've never heard of it till just now, but there is apparently the expand
command, which can be used to fix indentation. From https://stackoverflow.com/a/11094620/5479837 :
find . -name '*.java' ! -type d -exec bash -c 'expand -t 4 "$0" > /tmp/e && mv /tmp/e "$0"' {}
Another example from the same page, this time using the sponge
command from moreutils:
find ./ -iname '*.java' -type f -exec bash -c 'expand -t 4 "$0" | sponge "$0"' {} \;
Explanation:
./
is recursively searching from current directory-iname
is a case insensitive match (for both *.java and .JAVA likes)type -f
finds only regular files (no directories, binaries or symlinks)-exec bash -c
execute following commands in a subshell for each file name, {}expand -t 4
expands all TABs to 4 spacessponge
- soak up standard input (from expand) and write to a file (the same one).
Geany settings
Maybe we could update the default settings in Geany, so that all Pups have Geany setup to use two spaces by default, rather than to respect files existing indentation...?
Just a thought.. That is how I set it up on mine (for working on my own stuff).
Other editors
Sublime-Text 3
This is an excellent text editor, available in most repos. Can easily fix indentation using the editor, or various indentation plugins.
Micro
FYI, I use micro
(https://github.com/zyedidia/micro/) a terminal based editor, with modern stuff like linters, formatters, multiple select, modern (VS code or sublime-like) key bindings, etc. I rarely use Geany these days.
Micro can be setup to use prettier
and a bunch of other formatters that properly formats your code on save... Or you can manually set these things in ~/.config/micro/settings.json
, on a per-filetype basis.
If we do need sub-shells
Should we always be using $()
instead of back ticks?
The major benefit is the ability to nest
$()
commands within commands, without losing your sanity trying to figure out if some form of escaping will work on the backticks.
$()
is POSIX and supported by all modern Bourne shells, e.g. ksh, bash, ash, dash, zsh, busybox, you name it
Here's something I didn't know about the difference:
Also, a note about using
$()
and backticks in aliases. If you have aliasfoo=$(command)
in your.bashrc
then command will be executed when the alias command itself is run during.bashrc
interpretation. With alias foo=command
, command will be executed each time the alias is run. But if you escape the$
with the$()
form (e.g.alias foo=\$(command)
), it too will execute each time the alias is run, instead of during.bashrc
interpretation. As far as I can tell by testing, anyway; I can't find anything in the bash docs which explain this behavior.
Redirecting or suppressing output
What is preferred?
foo >/dev/null 2>&1
# or
foo &>/dev/null
Because I personally like doing it this way, as I find it's the only syntax I ever remember:
# supress all output
foo &>/dev/null
# suppress stdout
foo 1>/dev/null
# suppress errors
foo 2>/dev/null
Any reasons not to use this simpler syntax? It works in Bash
and Busybox Ash
.
Preferred shells
Should we prefer ash
to bash
?
I would say yes.
Busybox ash
has long had lots of "Bashisms" available, like ${foo/bar/baz}
, etc.
I've used Bash arrays a lot inmdsh
, and tbh, I'm not that keen...
Overall, I don't see much benefit to using Bash over Ash, except that the read
builtin in Bash has more options (-i
.. see ppa2pup
script in Pkg.. IIRC).
Hashbangs
Should we define which shell we use explicitly at the top of our scripts?
#!/bin/bash
and
#!/bin/ash
...rather than
#!/bin/sh
?
Suggestions, if I may.
- Should we prefer ash to bash?
See response to next question.
- Should we define which shell we use explicitly at the top of our scripts?
Use #!/bin/sh if you are absolutely sure that your script can run in any kind of bourne-compatible shell. This include bash, ash, dash, zsh, and a few others.
If your script uses bashim, then use #!/bin/bash explicitly.
In Fatdog we uses #!/bin/dash for most scripts (dash is statically compiled) to ensure that it is small and fast to launch. "dash" is the closest you can get to the original bourne shell, so if it runs in dash it would (almost) be sure to run in other shells too. However, it doesn't have a lot of features; we have to resort to calling "sed" or "awk" to do anything other than the simple things. This may defeat the purpose because calling an external program is always slower than executing shell builtins, so depending on what your script is doing, "bash" may be better than "dash" (and indeed, we have a few scripts which explicitly specify #!/bin/bash for this reason).
If you need something beyond what "dash" can do but does not need the entire "bash" support and still want to be reasonably fast, "ash" is a good choice. But beware. This is not the case in Puppy, but "busybox ash" can actually be compiled without support for bashism - so if you want to use "#!/bin/ash" that uses bashim, then make an explicit note about it so someone else who reads your script knows what to expect.
- Redirecting or suppressing output
foo >/dev/null 2>&1
foo 1>/dev/null
foo 2>/dev/null
is standard. The rest is bashism. Use anything you like, but see the note about hashbang above.
- Should we always be using $() instead of back ticks?
Major benefits of $() are: a. It's easier on the eyes and less likely to be mis-read b. It can be nested.
- Other editors.
No comment needed. Preferred editors are purely subjective taste.
- Geany settings.
No comment needed. You all just need to decide/agree.
- which/type
"type" is a shell built-in which is supported in almost all known shells (dash,ash,bash) and is definitely faster. If you need to find several programs the difference will stack up. In my earlier scripting I always used "which" but after I discovered "type" I abandoned it almost completely.
- Indentation.
I personally uses 4-space tab indentation. It makes code oh so much easier to read and follow. One-space indentation is the same as no indentation. I see many other (non-Puppy) codes that indents at two-spaces. I don't like it, but that's just me.
Using Shellcheck to improve your shell scripts
# shellcheck
Usage: shellcheck [OPTIONS...] FILES...
-e CODE1,CODE2.. --exclude=CODE1,CODE2.. exclude types of warnings
-f FORMAT --format=FORMAT output format
-C[WHEN] --color[=WHEN] Use color (auto, always, never)
-s SHELLNAME --shell=SHELLNAME Specify dialect (sh,bash,dash,ksh)
-x --external-sources Allow 'source' outside of FILES.
-V --version Print version information
Examples:
- Check a shell script:
shellcheck file.sh
- Override script's shebang:
shellcheck --shell sh|bash|ksh file.sh
- Ignore certain errors:
shellcheck --exclude SC1009 file.sh
- Ignore multiple errors:
shellcheck --exclude SC1009,SC1073 file.sh
Example of shellcheck /usr/sbin/pkg
:
In /usr/sbin/pkg line 27:
APPTITLE="$APPNAME $APPVER"
^-- SC2034: APPTITLE appears unused. Verify it or export it.
In /usr/sbin/pkg line 33:
SELF=$(basename $0) # this script
^-- SC2086: Double quote to prevent globbing and word splitting.
In /usr/sbin/pkg line 59:
[ -z "$PKG_PPA2PUP_FN" ] && if [ ! -z "`which gawk`" ]; then PKG_PPA2PUP_FN=ppa2pup_gawk; else PKG_PPA2PUP_FN=ppa2pup; fi
^-- SC2006: Use $(..) instead of legacy `..`.
In /usr/sbin/pkg line 77:
[ -f /tmp/pkg/error130 -a "$2" = '130' ] && exit "$2"
^-- SC2166: Prefer [ p ] && [ q ] as [ p -a q ] is not well defined.
...
In /usr/sbin/pkg line 294:
[ -f $TMPDIR/func_list ] && cat $TMPDIR/func_list | sort && exit 0
^-- SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.
...
In /usr/sbin/pkg line 368:
pkg_version=`echo $PKGNAME | sed -e "s/^${PKGNAME_ONLY}_//g"`
^-- SC2001: See if you can use ${variable//search/replace} instead.
...
In /usr/sbin/pkg line 508:
pkg_name_only="`echo "$(basename "${1}")" | sed -e 's/-[^-][^+][^a-zA-Z][0-9.]*/@/'`"
^-- SC2001: See if you can use ${variable//search/replace} instead.
^-- SC2005: Useless echo? Instead of 'echo $(cmd)', just use 'cmd'.
...
In /usr/sbin/pkg line 2045:
[ "$1" ] && get_repo_info "$1" || . ${PKGRC}
^-- SC2015: Note that A && B || C is not if-then-else. C may run when A is true.
...
In /usr/sbin/pkg line 3199:
cd -
^-- SC2164: Use cd ... || exit in case cd fails.
^-- SC2103: Use a ( subshell ) to avoid having to cd back.
...
In /usr/sbin/pkg line 3280:
if [ "`find /tmp/* -iname $1*.t*`" != '' ];then
^-- SC2061: Quote the parameter to -iname so the shell won't interpret it.
In /usr/sbin/pkg line 3351:
build_number="-$(($build_number + 1))"
^-- SC2004: $/${} is unnecessary on arithmetic variables.
In /usr/sbin/pkg line 3370:
rm -rf "${CURDIR}/${PKGNAME}/" 2>/dev/null
^-- SC2115: Use "${var:?}" to ensure this never expands to / .
...
In /usr/sbin/pkg line 4793:
[ -z $HOME/.packages/${PKGNAME}.files ];then
^-- SC2157: Argument to -z is always false due to literal strings.
You can go through the script, excluding more and more SCnnnnn
stuff, and tackle them bit by bit... Lots of work to do in Pkg
:/
Checking for specific "code smells" or problems
Each individual ShellCheck issue has its own wiki page, like these: SC1000, SC1001, ..., SC2236.
GitHub Wiki does not support enumerating them on a page, so to find the one you're interested in, you can do any of the following:
- Use the "> Pages" menu on the right (desktop) or below (mobile) and search for the error code, e.g. SC2086.
- Visit an arbitrary issue and edit the URL in your browser.
- Trigger the issue on shellcheck.net and click/tap the issue id.
Example pages:
-
SC2002: Remove useless call to
cat
- List of all checks: https://github.com/koalaman/shellcheck/wiki/Checks
Customising what shellcheck
looks at
See https://github.com/koalaman/shellcheck/wiki/Directive
Ignoring one specific line in a file:
Use a directive to disable a certain line:
hexToAscii() {
# shellcheck disable=SC2059
printf "\x$1"
}
Some nice comments :smiley:
As for indentation, my main point was that if you edit a script in a commit keep the indentation the same as the original.
I personally dislike 1 space indentation for the same reason James stated above. In init
in Puppy's initrd.gz
1 space is used and in large nested if
statements it can be a challenge finding the end of one, especially on a small screen or in a console editor. (which reminds me I have to reinstate busybox vi
at some point).
After this thread has been going for a while I'll close the issue and glean the best ideas for inclusion for a readme
or even make a wiki entry.
Keep 'em coming!
Maybe this readme
could include stuff about using standardised or consistent directories and locations for scripts, config files, etc.
Managing $HOME
It might be nice to have in a readme
(developer handbook?) info about managing $HOME
properly:
- not hard-coding values like
/root/
in scripts or config files - where to put config files:
-
$HOME/.program-namerc
-
$HOME/.config/<program-name>/file.config
- any other/better places?
-
System wide settings vs users settings
And also suggest putting config files in /etc/<program-name>/
, until after first launch of program-name
, so they get copied to $HOME, no matter where it is, or which user runs the program.
Managing shared assets
Maybe we can also encourage users to put icons, shared scripts, config files, in /usr/share/<program-name>/
, rather than /usr/lib/
.. Any better places?
Using /tmp
and /var
dirs properly
- about persistence (or not)
- what to put in /tmp
- what to put in /var
- how to use
/tmp
:- get PID of your script or program before writing to
/tmp
- write to
/tmp/<program-name>/<pid>/file
, if possible, not/tmp/file
- don't change its permissions
- get PID of your script or program before writing to
- how to use
/var/
- ...
Other system directories
Maybe some info like:
Your programs should not normally write or change files in these directories:
-
/bin
- used for system programs and binaries -
/lib
- used for system libraries - ...
Commit messages in this project are horrible! Many are Update x
or Create x
(maybe because people upload and edit files straight through GitHub instead of using an IDE or git commit
). They're not informative and sometimes a single commit changes multiple, unrelated things. And in other cases, identical or related changes to multiple files (like mass find/replace on .desktop files) are one commit per file, and that's messy too.
(And IMHO coding style, etc' should go to a CONTRIBUTING.md
file or something that's part of the repo.)
(And IMHO coding style, etc' should go to a
CONTRIBUTING.md
file or something that's part of the repo.)
There's an idea.