ast
ast copied to clipboard
ASAN heap-use-after-free when running nameref.sh unit test
The nameref.sh unit test triggers one of the three remaining bugs found by ASAN. This is the minimum reproducer:
typeset +n c x
compound c=( typeset -a x )
nameref x=c.x
unset c
typeset +n nr
compound c
compound -A c.a
nameref nr=c.a[hello]
empty_fifos
exit 23
The final exit 23
is just so I know whether or not it reaches that statement. The ASAN failure occurs in the empty_fifos
function so the exit 23
shouldn't be executed. The failure appears due to the first nameref x
referring to an object that no longer exists. Here is the ASAN output:
==87849==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b000004c30 at pc 0x000105d8c90c bp 0x7ffeea022c90 sp 0x7ffeea022c88
READ of size 8 at 0x60b000004c30 thread T0
#0 0x105d8c90b in nv_create name.c:861
#1 0x105d82dfe in nv_open name.c:1268
#2 0x105bf0be9 in sh_readline read.c:468
#3 0x105bef9f7 in b_read read.c:401
#4 0x105e535f5 in sh_exec xec.c:1218
#5 0x105e5d7ba in sh_exec xec.c:1826
#6 0x105e5d621 in sh_exec xec.c:1810
#7 0x105e71e89 in sh_funscope xec.c:3174
#8 0x105e6b8fd in sh_funct xec.c:2765
#9 0x105e56a52 in sh_exec xec.c:1395
#10 0x105d71100 in exfile main.c:515
#11 0x105d750b1 in sh_main main.c:313
#12 0x105bca11b in main pmain.c:41
#13 0x7fff6a93c3d4 in start (libdyld.dylib:x86_64+0x163d4)
0x60b000004c30 is located 80 bytes inside of 100-byte region [0x60b000004be0,0x60b000004c44)
freed by thread T0 here:
#0 0x1062682fd in wrap_free (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x592fd)
#1 0x10600002b in ast_free vmbusy.c:50
#2 0x105d78ea5 in nv_delete name.c:1144
#3 0x105dc1f41 in outval nvtree.c:798
#4 0x105dc5515 in genvalue nvtree.c:971
#5 0x105db8fab in walk_tree nvtree.c:1138
#6 0x105dc66f7 in put_tree nvtree.c:1197
#7 0x105da6cae in nv_putv nvdisc.c:145
#8 0x105d87146 in _nv_unset name.c:2187
#9 0x105c0a6a7 in setall typeset.c:702
#10 0x105c119a1 in b_typeset typeset.c:500
#11 0x105e535f5 in sh_exec xec.c:1218
#12 0x105d71100 in exfile main.c:515
#13 0x105d750b1 in sh_main main.c:313
#14 0x105bca11b in main pmain.c:41
#15 0x7fff6a93c3d4 in start (libdyld.dylib:x86_64+0x163d4)
previously allocated by thread T0 here:
#0 0x106268687 in wrap_calloc (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x59687)
#1 0x105fffe43 in ast_calloc vmbusy.c:32
#2 0x105db027b in newnode nvdisc.c:730
#3 0x105db1bb9 in nv_search nvdisc.c:960
#4 0x105d8b54d in nv_create name.c:809
#5 0x105d82dfe in nv_open name.c:1268
#6 0x105c09c13 in setall typeset.c:654
#7 0x105c119a1 in b_typeset typeset.c:500
#8 0x105e535f5 in sh_exec xec.c:1218
#9 0x105d7efd8 in sh_setlist name.c:557
#10 0x105e4f80e in sh_exec xec.c:1043
#11 0x105d71100 in exfile main.c:515
#12 0x105d750b1 in sh_main main.c:313
#13 0x105bca11b in main pmain.c:41
#14 0x7fff6a93c3d4 in start (libdyld.dylib:x86_64+0x163d4)
The heap-use-after-free is due to the shp->oldnp->nvenv
portion of the final statement in this block of code:
https://github.com/att/ast/blob/609b031900918994a842712685e3ea0236104d7b/src/cmd/ksh93/sh/name.c#L855-L861
There are a couple of interesting points about this code. First, this is the only use of nv_refoldnp()
defined here:
https://github.com/att/ast/blob/609b031900918994a842712685e3ea0236104d7b/src/cmd/ksh93/include/name.h#L577
Notice that the code is testing whether any non-zero value has been assigned to np->nvalue
. It specifically tests whether a value has been assigned to the const_cp
member of that union but then proceeds to fetch the nrp
member (via the nv_refoldnp()
macro). The code is chock full of such sloppiness.
It's not entirely clear why the shp->oldnp
structure member exists. But it is clear that somewhere there is a missing, or incorrect, update of that value.
Note that the shp->oldnp
structure member does not exist in the ksh93u code. This was introduced after that release.
FWIW, I've instrumented the code and it appears the problem is that the unset c
statement is not invalidating the preceding nameref x=c.x
; that is, invalidating the x
nameref. So the question becomes: What does it mean to unset a var that has been captured by a nameref
(or equivalently a typeset -n
)? The documentation does not appear to say what should happen.
Adding a unset x
before the unset c
to first remove the nameref to c.x
"fixes" this problem.
P.S., the empty_fifos
function invocation isn't even necessary. It is simply an accident that it triggers the problem. With the DPRINTF()
, DPRINT_NVP()
and similar debug statements I've added to help figure out what is going on ASAN detects the problem before the empty_fifos
ksh function, and its embedded read
command, is executed.
Allow me to rephrase the question in my previous comment. The code clearly does not employ a reference count model for struct Namval
(i.e., Namval_t
) objects. Thus, any use of unset
on a var that has been "captured" by a nameref
(or typeset -n
) statement is dangerous. So what does it mean to allow unsetting a var which has an extant "nameref"`? Especially since that reference may be indirect.
Also, with the instrumentation in place that dumps info about the various Namval_t
and struct Namref
values the problem can be induced with this trimmed down ksh script:
compound c=( typeset -a x )
nameref x=c.x
unset c
compound c
compound -A c.a
nameref nr=c.a[hello]
And, as per my prior comment, simply inserting a unset x
before the unset c
"fixes" the problem.