ast icon indicating copy to clipboard operation
ast copied to clipboard

ASAN heap-use-after-free when running nameref.sh unit test

Open krader1961 opened this issue 5 years ago • 6 comments

The nameref.sh unit test triggers one of the three remaining bugs found by ASAN. This is the minimum reproducer:

typeset +n c x
compound c=( typeset -a x )
nameref x=c.x

unset c
typeset +n nr
compound c
compound -A c.a
nameref nr=c.a[hello]

empty_fifos
exit 23

The final exit 23 is just so I know whether or not it reaches that statement. The ASAN failure occurs in the empty_fifos function so the exit 23 shouldn't be executed. The failure appears due to the first nameref x referring to an object that no longer exists. Here is the ASAN output:

==87849==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b000004c30 at pc 0x000105d8c90c bp 0x7ffeea022c90 sp 0x7ffeea022c88
READ of size 8 at 0x60b000004c30 thread T0
    #0 0x105d8c90b in nv_create name.c:861
    #1 0x105d82dfe in nv_open name.c:1268
    #2 0x105bf0be9 in sh_readline read.c:468
    #3 0x105bef9f7 in b_read read.c:401
    #4 0x105e535f5 in sh_exec xec.c:1218
    #5 0x105e5d7ba in sh_exec xec.c:1826
    #6 0x105e5d621 in sh_exec xec.c:1810
    #7 0x105e71e89 in sh_funscope xec.c:3174
    #8 0x105e6b8fd in sh_funct xec.c:2765
    #9 0x105e56a52 in sh_exec xec.c:1395
    #10 0x105d71100 in exfile main.c:515
    #11 0x105d750b1 in sh_main main.c:313
    #12 0x105bca11b in main pmain.c:41
    #13 0x7fff6a93c3d4 in start (libdyld.dylib:x86_64+0x163d4)

0x60b000004c30 is located 80 bytes inside of 100-byte region [0x60b000004be0,0x60b000004c44)
freed by thread T0 here:
    #0 0x1062682fd in wrap_free (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x592fd)
    #1 0x10600002b in ast_free vmbusy.c:50
    #2 0x105d78ea5 in nv_delete name.c:1144
    #3 0x105dc1f41 in outval nvtree.c:798
    #4 0x105dc5515 in genvalue nvtree.c:971
    #5 0x105db8fab in walk_tree nvtree.c:1138
    #6 0x105dc66f7 in put_tree nvtree.c:1197
    #7 0x105da6cae in nv_putv nvdisc.c:145
    #8 0x105d87146 in _nv_unset name.c:2187
    #9 0x105c0a6a7 in setall typeset.c:702
    #10 0x105c119a1 in b_typeset typeset.c:500
    #11 0x105e535f5 in sh_exec xec.c:1218
    #12 0x105d71100 in exfile main.c:515
    #13 0x105d750b1 in sh_main main.c:313
    #14 0x105bca11b in main pmain.c:41
    #15 0x7fff6a93c3d4 in start (libdyld.dylib:x86_64+0x163d4)

previously allocated by thread T0 here:
    #0 0x106268687 in wrap_calloc (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x59687)
    #1 0x105fffe43 in ast_calloc vmbusy.c:32
    #2 0x105db027b in newnode nvdisc.c:730
    #3 0x105db1bb9 in nv_search nvdisc.c:960
    #4 0x105d8b54d in nv_create name.c:809
    #5 0x105d82dfe in nv_open name.c:1268
    #6 0x105c09c13 in setall typeset.c:654
    #7 0x105c119a1 in b_typeset typeset.c:500
    #8 0x105e535f5 in sh_exec xec.c:1218
    #9 0x105d7efd8 in sh_setlist name.c:557
    #10 0x105e4f80e in sh_exec xec.c:1043
    #11 0x105d71100 in exfile main.c:515
    #12 0x105d750b1 in sh_main main.c:313
    #13 0x105bca11b in main pmain.c:41
    #14 0x7fff6a93c3d4 in start (libdyld.dylib:x86_64+0x163d4)

krader1961 avatar Apr 03 '19 02:04 krader1961

The heap-use-after-free is due to the shp->oldnp->nvenv portion of the final statement in this block of code:

https://github.com/att/ast/blob/609b031900918994a842712685e3ea0236104d7b/src/cmd/ksh93/sh/name.c#L855-L861

There are a couple of interesting points about this code. First, this is the only use of nv_refoldnp() defined here:

https://github.com/att/ast/blob/609b031900918994a842712685e3ea0236104d7b/src/cmd/ksh93/include/name.h#L577

Notice that the code is testing whether any non-zero value has been assigned to np->nvalue. It specifically tests whether a value has been assigned to the const_cp member of that union but then proceeds to fetch the nrp member (via the nv_refoldnp() macro). The code is chock full of such sloppiness.

It's not entirely clear why the shp->oldnp structure member exists. But it is clear that somewhere there is a missing, or incorrect, update of that value.

krader1961 avatar Apr 03 '19 03:04 krader1961

Note that the shp->oldnp structure member does not exist in the ksh93u code. This was introduced after that release.

krader1961 avatar Apr 03 '19 04:04 krader1961

FWIW, I've instrumented the code and it appears the problem is that the unset c statement is not invalidating the preceding nameref x=c.x; that is, invalidating the x nameref. So the question becomes: What does it mean to unset a var that has been captured by a nameref (or equivalently a typeset -n)? The documentation does not appear to say what should happen.

Adding a unset x before the unset c to first remove the nameref to c.x "fixes" this problem.

P.S., the empty_fifos function invocation isn't even necessary. It is simply an accident that it triggers the problem. With the DPRINTF(), DPRINT_NVP() and similar debug statements I've added to help figure out what is going on ASAN detects the problem before the empty_fifos ksh function, and its embedded read command, is executed.

krader1961 avatar Apr 04 '19 05:04 krader1961

Allow me to rephrase the question in my previous comment. The code clearly does not employ a reference count model for struct Namval (i.e., Namval_t) objects. Thus, any use of unset on a var that has been "captured" by a nameref (or typeset -n) statement is dangerous. So what does it mean to allow unsetting a var which has an extant "nameref"`? Especially since that reference may be indirect.

krader1961 avatar Apr 04 '19 05:04 krader1961

Also, with the instrumentation in place that dumps info about the various Namval_t and struct Namref values the problem can be induced with this trimmed down ksh script:

compound c=( typeset -a x )
nameref x=c.x

unset c
compound c
compound -A c.a
nameref nr=c.a[hello]

krader1961 avatar Apr 04 '19 05:04 krader1961

And, as per my prior comment, simply inserting a unset x before the unset c "fixes" the problem.

krader1961 avatar Apr 04 '19 05:04 krader1961