shellcheck
ShellCheck busy loops and allocates memory until it is killed by OOM killer
For bugs
- My shellcheck version (shellcheck --version or "online"): 0.9.0
- Originally reported in Red Hat Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2180035
Here's a snippet or screenshot that shows the problem:
A (gzipped) script that triggers the issue is attached: testsuite.gz
I was not able to isolate the problem to a small shell script. Perhaps the attached shell script is just too long/complex for shellcheck to process it?
Here's what shellcheck currently says:
shellcheck busy loops and allocates memory until it is killed by OOM killer.
Here's what I wanted or expected to see:
shellcheck should print static analysis results and exit successfully.
probably same as #2652 (guessed reason is in a comment).
This is undoubtedly due to ShellCheck's new data flow analysis engine. It takes great care to be acceptably fast even for larger scripts, but 22,000 lines is something else.
you have about 307 742 lines in that test file.
@brother Thanks for the pointer! Unfortunately, in our case the problem happens with shellcheck-0.7.2, too.
Same problem here. It persists even after I close nvim, and I have to forcibly power off.
The same problem happened when running shellcheck on a configure script (7.6k, 20986 lines).
I have encountered the same issue with huge generated scripts made by autoconf.
I had a look at the attached testsuite.gz file. It turns out to be generated by autotest (part of the autotools suite, just as autoconf). Some searching indicates that this is based on https://github.com/firewalld/firewalld/blob/main/src/tests/testsuite.at, but that is not really relevant.
I bisected the file to try and understand what caused the issue. The large file itself was not a problem; if you cut the file in half, one half completes shellcheck in 0.1 seconds, while the other spins forever consuming more and more memory. (I hit ctrl-c after ~10 seconds as an indicator of whether or not the problem was present).
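One step of that bisection could be scripted roughly as below. This is only a sketch: CHECKER, LIMIT, probe and bisect_step are hypothetical names made up for illustration, and it assumes the GNU coreutils timeout command is available (it is not installed by default on macOS). A run that is still going after LIMIT seconds is treated as a hang, mirroring the ~10 second ctrl-c rule above.

```shell
#!/bin/bash
# Sketch of one bisection step. CHECKER and LIMIT are hypothetical knobs,
# not shellcheck options; `timeout` is the GNU coreutils command.
CHECKER=${CHECKER:-shellcheck}
LIMIT=${LIMIT:-10}

# Prints "hang" if the checker is still running after LIMIT seconds,
# "ok" otherwise. The checker's own exit status is deliberately ignored.
probe() {
  timeout "$LIMIT" "$CHECKER" "$1" >/dev/null 2>&1
  if [ $? -eq 124 ]; then echo hang; else echo ok; fi
}

# Splits FILE at its middle line into FILE.top and FILE.bottom,
# then probes each half.
bisect_step() {
  local file=$1 total half
  total=$(wc -l < "$file")
  half=$(( total / 2 ))
  head -n "$half" "$file" > "$file.top"
  tail -n +"$(( half + 1 ))" "$file" > "$file.bottom"
  echo "top: $(probe "$file.top")"
  echo "bottom: $(probe "$file.bottom")"
}
```

Calling bisect_step testsuite and then repeating on whichever half reports "hang" narrows the search. Cutting at an arbitrary line will usually leave each half syntactically broken, so this only tells you whether shellcheck terminates, not whether its findings are meaningful.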
I managed to get it down to a single line of difference that determines whether shellcheck hangs or not. It is an opening (, which has no obvious matching closing ) (not one that I could easily find, anyway).
Here is the patch that turns testsuite from a CPU/memory hog into a file that completes in 0.2 seconds on my computer:
--- testsuite 2024-01-31 14:15:15
+++ testsuite.working 2024-01-31 14:16:06
@@ -2326,7 +2326,6 @@
at_fn_group_banner 1 'firewall-cmd.at:5' \
"basic options" " " 1
at_xfail=no
-(
printf "%s\n" "1. $at_setup_line: testing $at_desc ..."
$at_traceon
It is not just the matching of ( itself that is problematic. I had a hunch that a long search for the closing parenthesis could trigger this bug, so I created this script to generate a huge test file enclosed within a (...) pair. But it worked fine with shellcheck.
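#!/bin/bash
# Generate a huge script whose whole body sits inside a single ( ... ) subshell.
SCRIPT=foo.sh
rm -f "$SCRIPT"   # -f: do not complain if the file does not exist yet
echo '#!/bin/bash' > "$SCRIPT"
echo "(" >> "$SCRIPT"
for i in {1..1000000}; do
  echo "echo Line $i" >> "$SCRIPT"
done
echo ")" >> "$SCRIPT"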
As I said in #2652, this does not seem to be related to the extended analysis.
Furthermore, when looking closely at the memory consumption, I see that it increases not gradually, but in huge discrete steps. My guess is that it is a single array that is getting reallocated over and over as it grows without bound.
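For anyone who wants to reproduce the step pattern, a rough way to sample memory use from the shell is sketched below. sample_rss and INTERVAL are hypothetical names for this sketch. It relies only on ps -o rss=, which reports the resident set size in kilobytes on both Linux and macOS.

```shell
#!/bin/bash
# Runs a command in the background and prints its resident set size
# (kilobytes, as reported by `ps -o rss=`) every INTERVAL seconds until
# the command exits. sample_rss and INTERVAL are hypothetical names.
INTERVAL=${INTERVAL:-1}

sample_rss() {
  "$@" >/dev/null 2>&1 &
  local pid=$! rss
  while kill -0 "$pid" 2>/dev/null; do
    rss=$(ps -o rss= -p "$pid" 2>/dev/null | tr -d ' ')
    [ -n "$rss" ] && echo "$rss"
    sleep "$INTERVAL"
  done
  wait "$pid" 2>/dev/null
  return 0
}
```

For example, sample_rss shellcheck testsuite prints one number per second. Large jumps between consecutive samples would be consistent with the reallocation guess, although sampling once per second cannot prove it.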
I have turned my eye toward the script that is problematic for me. (This is the shell script generated by autoconf for the OpenJDK project.) I have attached the script in question here: autoconf-reproducer.zip
When I run this on my M1 Mac, it seemingly never ends. After about 1 minute of running, by which point it had consumed more than 6 GB of RAM, I aborted it.
However, when a single line is deleted from the file, the analysis finished in 4-5 seconds (which is reasonable, considering that the file is ~ 170k lines), and I could not even measure the RAM consumption.
The patch that does this magic trick is here:
--- autoconf-BAD 2024-02-06 10:57:23
+++ autoconf-GOOD 2024-02-06 10:57:32
@@ -25504,7 +25504,6 @@
fi # end with or without slashes
# Now we have a usable command as new_path, with arguments in arguments
- if test "x$OPENJDK_BUILD_OS" = "xwindows"; then
if test "x$fixpath_prefix" = x; then
# Only mess around if fixpath_prefix was not given
Now I realize that this deletes an opening if statement, and that will throw off all subsequent syntax. However, my point here is that it is not reasonable to expect shellcheck to handle one file perfectly fine, but fail miserably if a single line is added.
So the issue is not really the huge size of the script per se. I understand that a 170k line script is massive, but without the problematic construct this works perfectly fine.
My guess is that there is some special construct in here that triggers bad behavior in shellcheck. It seems like the complexity in both time and memory grows as O(n^x) in the number of lines in the file when this construct is encountered.
Hence the large reproducer. A smaller reproducer would not show the bad complexity convincingly. I've tried my best to get it down to a single line difference. My hope is that someone more versed in shellcheck debugging can figure out exactly what problem this additional line provokes. I am a complete noob at Haskell; sorry. Otherwise I would have tried running the "bad" file for a while and then checking which array it is that is growing without bound. That would probably give a decent clue to what the problem is. Maybe @koalaman can have a look?
I also tried this with the latest version, v0.9.0-99-gd80fdfa, and running as shellcheck --extended-analysis=false autoconf-BAD.
This gave a better result -- the memory consumption stayed below 3 GB, and the command actually finished after somewhat more than 2 minutes. (Just to confirm, I re-ran without --extended-analysis=false and waited 3 minutes. It was still running by then, and had consumed 20 GB memory so I had to kill it.)
However, it is still a far cry from the patched file, which finishes in 5 seconds, both with and without --extended-analysis=false.
So whatever the problem is, it is aggravated by the extended analysis, but it definitely exists even without it.
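A comparison like this can be repeated with a small wrapper that caps each run and reports elapsed wall-clock time. The sketch below assumes GNU coreutils timeout. CAP, timed and compare are hypothetical names, and --extended-analysis=false is the flag used above, which needs a shellcheck build that supports it.

```shell
#!/bin/bash
# Prints elapsed whole seconds and the exit status of a command, killing
# it after CAP seconds (status 124 means the cap was hit).
# CAP, timed and compare are hypothetical names for this sketch.
CAP=${CAP:-180}

timed() {
  local start=$SECONDS status
  timeout "$CAP" "$@" >/dev/null 2>&1
  status=$?
  echo "$(( SECONDS - start ))s status=$status"
}

# Runs shellcheck on FILE with and without the extended analysis.
compare() {
  local file=$1
  echo "default: $(timed shellcheck "$file")"
  echo "extended analysis off: $(timed shellcheck --extended-analysis=false "$file")"
}
```

Running compare autoconf-BAD followed by compare autoconf-GOOD would give the four data points discussed here. Memory use is not captured by this wrapper.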
I am having a similar issue when editing my .bashrc with nvim + mason + bash-language-server (which calls shellcheck for linting). After exploring a bit, I found that it is caused by a reference that eventually leads to the file /usr/share/nvm/nvm.sh, which is only 4000 lines of code but takes 10 seconds and 2.6G of memory to analyze.
Unlike the result from @magicus https://github.com/koalaman/shellcheck/issues/2721#issuecomment-1929204173, setting --extended-analysis=false helps with my issue (1.3 seconds and 130M of memory). I'm not a shell scripting expert, so I failed to pin down the part causing the issue, even after bisecting the script.
A workaround is to add a line # shellcheck external-sources=false before the line causing the problem.
- OS: Arch Linux, with kernel Linux 6.8.9-arch1-1 x86_64
- shellcheck version: 0.10.0
- script causing the issue: nvm.zip (from nvm 0.39.7)