quark-engine

Rule gets 60%, I think it should get 80% or 100%. Not sure.

Open cryptax opened this issue 4 years ago • 7 comments

Students and I are working on an unpacked malicious sample, TI.json (sha256: b87b564401214ffe26592ada3f619126e2d83e2fc9d2fe55e421c8fd6ab2bf2f). It is an unpacked Android/Flubot. The DEX can be downloaded here: It is encrypted with GPG, using the password infected.

I am trying to write a rule that detects that SMS content or number is sent over HTTP. This is the rule:

{
    "crime": "Get SMS address and send it through http",
    "permission": [],
    "api": [
        {
            "descriptor": "()Ljava/lang/String;",
            "class": "Landroid/telephony/SmsMessage;",
            "method": "getDisplayOriginatingAddress"
        },
        {
            "descriptor": "()Ljava/io/OutputStream;",
            "class": "Ljava/net/HttpURLConnection;",
            "method": "getOutputStream"
        }
    ],
    "score": 1,
    "label": [
        "sms",
        "http"
    ]
}

It only achieves 60%.

I am a bit surprised. Perhaps this is normal, but I'd prefer to check with you.

In SmsReceiver.onReceive(), we have this code where the originating phone number of the SMS is retrieved.

SmsMessage v2 = SmsMessage.createFromPdu(((byte[])v10_1[v1]));
String incoming_sms_str = Deobfuscator.app.Release.getString(-8990920220163560112L) + v2.getDisplayOriginatingAddress().toString() + Deobfuscator.app.Release.getString(0x8339D5A919462D50L) + v2.getMessageBody().toString();

It is placed in a string incoming_sms_str, and then sent to the C&C's panel via PanelReq.SendAsync:

if(Bot.IsIntSms()) {  // intercept SMS?
PanelReq.SendAsync(incoming_sms_str, Boolean.valueOf(true));

What SendAsync does is create a Thread (is that why we only get 60%?) and then call Send, which forwards to another Send. There, the value gets encrypted (PanelReq.Encrypt) and Base64-encoded, then stored as POST content (SetPostContent).

byte[] inout = plaintext.getBytes(StandardCharsets.UTF_8);
PanelReq.Encrypt(inout, key, true);
String base64encoded_xor = Base64.encodeToString(inout, 2);
httpcom.SetPostContent(String.format(Deobfuscator.app.Release.getString(0x8339DB5B19462D50L), encrypted, base64encoded_xor));
httpcom.SetPost(true);
if(!httpcom.Submit()) {

Finally, the HTTP request is submitted. This actually creates a DataOutputStream and writes the stored post content (this.postContent) into it.

DataOutputStream v3 = new DataOutputStream(v2.getOutputStream());
v3.write(this.postContent, 0, this.postContent.length);

So, as I see it, getDisplayOriginatingAddress and getOutputStream happen sequentially and have the same "father", which is SmsReceiver.onReceive(). The only possible issue is the Thread in between. If the thread is not the issue, I think we should actually get 100%. Shouldn't we?

Is this the normal behaviour? If so, why only 60%? Can it be improved?

I am using Quark v21.4.3. Thanks !

cryptax, Apr 09 '21 07:04

Hi @cryptax ,

Thanks for reporting this issue! I am working on it now and will keep you updated if anything new comes up.

krnick, Apr 13 '21 05:04

Hello @cryptax ,

I analyzed the two Android native APIs against the malware you provided, and the resulting confidence of 60% is correct. Here is my analysis:

To reach 80% confidence, the two APIs must share a common parent function (in other words, the same super function).

For the first Android API, getDisplayOriginatingAddress, I searched for its super functions recursively and found only one: SmsReceiver.onReceive().

That means we must find the same function, SmsReceiver.onReceive(), among the super functions of the second Android API, getOutputStream.

I then ran the same recursive search for getOutputStream (graph below) and did not find SmsReceiver.onReceive(), which means the confidence can only reach 60%.

(I only show the method name)
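The reasoning above amounts to a search for common callers. Here is a minimal sketch of that idea, not Quark's actual implementation: the edge list is a hypothetical, simplified model of this sample (the class name HttpCom and the shortened method names are assumptions), and `ancestors` is an illustrative helper.

```python
from collections import defaultdict

# Hypothetical, simplified call edges (caller -> callee) modelled on the sample.
# Note that nothing calls PanelReq$1.run directly: the code calls Thread.start.
EDGES = [
    ("SmsReceiver.onReceive", "SmsMessage.getDisplayOriginatingAddress"),
    ("SmsReceiver.onReceive", "PanelReq.SendAsync"),
    ("PanelReq.SendAsync", "Thread.start"),
    ("PanelReq$1.run", "PanelReq.Send"),
    ("PanelReq.Send", "HttpCom.Submit"),
    ("HttpCom.Submit", "HttpURLConnection.getOutputStream"),
]


def build_callers(edges):
    """Invert the edge list into a callee -> set-of-callers map."""
    callers = defaultdict(set)
    for caller, callee in edges:
        callers[callee].add(caller)
    return callers


def ancestors(method, callers):
    """Return every direct or indirect caller of `method`."""
    seen = set()
    stack = [method]
    while stack:
        current = stack.pop()
        for parent in callers.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


callers = build_callers(EDGES)
first = ancestors("SmsMessage.getDisplayOriginatingAddress", callers)
second = ancestors("HttpURLConnection.getOutputStream", callers)
common = first & second  # an empty set means no shared parent was found
print(sorted(first), sorted(second), sorted(common))
```

In this toy model the upward search from getOutputStream stops at PanelReq$1.run, so the two ancestor sets never meet.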

While working on this issue, I found that a tree graph like this can help malware analysts trace code more quickly. I think I will release this feature soon. Thank you for inspiring this idea.

Hope this is helpful to you!

krnick, Apr 20 '21 06:04

@krnick thanks for looking into it. Using call graphs like this is unfortunately going to miss cases where an interface is used, which is the case with Threads/Runnable.

In your graph, the call to onReceive is in the first run (PanelReq$1.run()): screenshot

The issue is that this run() is inside a Runnable, which is itself inside a Thread. The "caller" of run() is Thread.start(). If you manage to reflect this, then we can go all the way up to onReceive(): Thread.start() is called by SendAsync(), which is called by onReceive().

cryptax, Apr 22 '21 08:04

To make Quark better at finding "who calls whom", perhaps you could add a special case for the run() methods of Thread or Runnable. In that case, the "caller" is start().

This is typically what I manually/intuitively do:

  1. Check who calls run().
$ grep -rF 'PanelReq$1;->run' ./com/example/myapplicationtest
$

Nobody calls it, which normally means "dead end".

  2. Check if the class implements an interface.
$ grep implements com/example/myapplicationtest/PanelReq\$1.smali 
.implements Ljava/lang/Runnable;

It does implement the Runnable interface, and I know that Runnable.run() is precisely one of the methods of that interface, so our PanelReq$1.run() is actually overriding it. Note that JEB manages to find which method overrides which.

  3. For Runnable and Thread specifically, we know that the call comes from Runnable.start or Thread.start. So we need to find who calls start on a PanelReq$1 object.
$ grep -r "Runnable;->start" ./com/example/myapplicationtest/
$ grep -r "Thread;->start" ./com/example/myapplicationtest/
./com/example/myapplicationtest/Spammer.smali:    invoke-virtual {p0}, Ljava/lang/Thread;->start()V
./com/example/myapplicationtest/SocksClient.smali:    invoke-virtual {v2}, Ljava/lang/Thread;->start()V
./com/example/myapplicationtest/SocksClient.smali:    invoke-virtual {v2}, Ljava/lang/Thread;->start()V
./com/example/myapplicationtest/SocksClient.smali:    invoke-virtual {p1}, Ljava/lang/Thread;->start()V
./com/example/myapplicationtest/PanelReq.smali:    invoke-virtual {v0}, Ljava/lang/Thread;->start()V
./com/example/myapplicationtest/Bot.smali:    invoke-virtual {v0}, Ljava/lang/Thread;->start()V
./com/example/myapplicationtest/Bot.smali:    invoke-virtual {p1}, Ljava/lang/Thread;->start()V
./com/example/myapplicationtest/Bot.smali:    invoke-virtual {p0}, Ljava/lang/Thread;->start()V
./com/example/myapplicationtest/Bot.smali:    invoke-virtual {v0}, Ljava/lang/Thread;->start()V

The interesting one is in PanelReq.smali:

new-instance v1, Lcom/example/myapplicationtest/PanelReq$1;
...
invoke-direct {v0, v1}, Ljava/lang/Thread;-><init>(Ljava/lang/Runnable;)V
invoke-virtual {v0}, Ljava/lang/Thread;->start()V

This is inside .method public static SendAsync(Ljava/lang/String;Ljava/lang/Boolean;)V. So, now we continue searching who calls SendAsync.
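The manual steps above could be automated roughly as follows. This is a sketch under stated assumptions, not how Quark or JEB work: it scans smali text line by line with regular expressions, tracks registers naively (no control flow), and only handles the `new Thread(runnable).start()` pattern shown above. The function names and the sample snippet (built from the lines quoted above, with the elided part left out) are illustrative.

```python
import re

# Sample input assembled from the smali lines quoted above (elided part omitted).
SMALI = r"""
.method public static SendAsync(Ljava/lang/String;Ljava/lang/Boolean;)V
    new-instance v1, Lcom/example/myapplicationtest/PanelReq$1;
    invoke-direct {v0, v1}, Ljava/lang/Thread;-><init>(Ljava/lang/Runnable;)V
    invoke-virtual {v0}, Ljava/lang/Thread;->start()V
.end method
"""

NEW_RE = re.compile(r"new-instance (\w+), (L[^;]+;)")
INIT_RE = re.compile(
    r"invoke-direct \{(\w+), (\w+)\}, Ljava/lang/Thread;-><init>\(Ljava/lang/Runnable;\)V"
)
START_RE = re.compile(r"invoke-virtual \{(\w+)\}, Ljava/lang/Thread;->start\(\)V")


def implements_runnable(class_smali):
    """Step 2: does this class file declare the Runnable interface?"""
    return any(
        line.strip() == ".implements Ljava/lang/Runnable;"
        for line in class_smali.splitlines()
    )


def thread_start_edges(smali_text, runnable_classes):
    """Step 3: map each Thread.start() call site to the run() it triggers."""
    edges = []
    method = None
    reg_types = {}       # register -> class created by new-instance
    thread_targets = {}  # Thread register -> Runnable class it wraps
    for raw in smali_text.splitlines():
        line = raw.strip()
        if line.startswith(".method"):
            method = line.split()[-1]
            reg_types, thread_targets = {}, {}
        elif line.startswith(".end method"):
            method = None
        elif method is None:
            continue
        elif (m := NEW_RE.match(line)):
            reg_types[m.group(1)] = m.group(2)
        elif (m := INIT_RE.match(line)):
            thread_reg, runnable_reg = m.groups()
            cls = reg_types.get(runnable_reg)
            if cls in runnable_classes:
                thread_targets[thread_reg] = cls
        elif (m := START_RE.match(line)):
            cls = thread_targets.get(m.group(1))
            if cls:
                edges.append((method, f"{cls}->run()V"))
    return edges


runnables = {"Lcom/example/myapplicationtest/PanelReq$1;"}
edges = thread_start_edges(SMALI, runnables)
print(edges)
```

Classes that extend Thread directly (for example the `invoke-virtual {p0}` hits in the grep output) would need separate handling that this sketch does not attempt.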

Hmm. Okay. It's easy for a human mind, but may be more complex to implement :(

cryptax, Apr 22 '21 09:04

For now, Quark does not support Threads, which is why the graph cannot follow calls that go through a Thread/Runnable.

Thank you @cryptax for sharing your valuable experience. I will try to see if I can implement this feature, even if it is difficult :).

krnick, Apr 24 '21 03:04

@krnick for this one we had a trick that might help: in the call tree analysis, replace any Thread.start() with Thread.run(), and this should automatically fix the call tree :) Of course, the code calls start() and not run(), so this would be a replacement for the analysis only.
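To illustrate the trick, here is a hedged sketch that applies it to the parent lookup only and leaves the APK untouched. It assumes the analysis already knows which run() each Thread.start() call site triggers. The `CALLERS` and `STARTED_BY` tables are a hypothetical, simplified model of this sample (HttpCom is an assumed class name), and `parents` and `path_to` are illustrative helpers, not Quark functions.

```python
from collections import deque

# Hypothetical callee -> callers table, as seen in the bytecode.
CALLERS = {
    "HttpURLConnection.getOutputStream": ["HttpCom.Submit"],
    "HttpCom.Submit": ["PanelReq.Send"],
    "PanelReq.Send": ["PanelReq$1.run"],
    "Thread.start": ["PanelReq.SendAsync"],
    "PanelReq.SendAsync": ["SmsReceiver.onReceive"],
}

# Assumed to be known: methods that start a Thread wrapping each Runnable.
STARTED_BY = {"PanelReq$1.run": ["PanelReq.SendAsync"]}


def parents(method, patch_threads):
    """Direct callers, optionally treating start() sites as callers of run()."""
    found = list(CALLERS.get(method, []))
    if patch_threads:
        found += STARTED_BY.get(method, [])
    return found


def path_to(target, source, patch_threads):
    """Breadth-first search upward from `source` until `target` is reached."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for parent in parents(path[-1], patch_threads):
            if parent not in seen:
                seen.add(parent)
                queue.append(path + [parent])
    return None


plain = path_to("SmsReceiver.onReceive", "HttpURLConnection.getOutputStream", False)
patched = path_to("SmsReceiver.onReceive", "HttpURLConnection.getOutputStream", True)
print(plain)
print(patched)
```

Without the patch the search dead-ends at PanelReq$1.run. With it, the path continues through SendAsync up to onReceive.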

cryptax, Apr 26 '21 07:04

Hi all,

We (@cryptax, @ciastron, @Dil3mm3 and I) tried to dig deeper into this issue. First, we tried to understand how Quark works by analyzing step 4 of the Quark analysis (the sequence check). In particular, we looked at the class Quark inside Quark.py and its methods find_previous_method, find_intersection, and check_sequence.

First, we noticed that the constant MAX_SEARCH_LAYER is too small to capture the whole sequence of function calls, so we increased it to 6. With that change, we were able to get all the parent methods of getOutputStream (the second API of the rule above), up to the node Lcom/example/myapplicationtest/PanelReq$1;->run()V (as shown in the graph posted above). There we hit another problem: Quark cannot find the parent function of that run() because no call to it is present in the code. Instead of a call to run(), the smali contains the call Ljava/lang/Thread;->start()V. Even if we replace Ljava/lang/Thread;->start()V with Ljava/lang/Thread;->run()V in the APK, as @cryptax suggested above, it does not work, because the signature of that run() differs from the one found earlier (Lcom/example/myapplicationtest/PanelReq$1;->run()V). Finally, if we manually insert the call Lcom/example/myapplicationtest/PanelReq$1;->run()V in place of Ljava/lang/Thread;->start()V in the method SendAsync(), we do get onReceive() as a parent function of both APIs of the rule.
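The effect of a search-depth limit can be shown with a small sketch. This is not Quark's code: the caller table is hypothetical (the real chain length in the sample may differ, and the names Send#1, Send#2 and HttpCom.Submit are placeholders), and `callers_within` is an illustrative helper that climbs at most `max_layer` levels.

```python
# Hypothetical callee -> callers table for the getOutputStream side of the rule.
CALLERS = {
    "HttpURLConnection.getOutputStream": {"HttpCom.Submit"},
    "HttpCom.Submit": {"PanelReq.Send#2"},
    "PanelReq.Send#2": {"PanelReq.Send#1"},
    "PanelReq.Send#1": {"PanelReq$1.run"},
}


def callers_within(method, callers, max_layer):
    """Collect callers reachable in at most `max_layer` upward steps."""
    reached = set()
    frontier = {method}
    for _ in range(max_layer):
        # Move one layer up, skipping methods we have already visited.
        frontier = {p for m in frontier for p in callers.get(m, ())} - reached
        if not frontier:
            break
        reached |= frontier
    return reached


shallow = callers_within("HttpURLConnection.getOutputStream", CALLERS, 3)
deep = callers_within("HttpURLConnection.getOutputStream", CALLERS, 6)
print(sorted(shallow))
print(sorted(deep))
```

In this toy chain a limit of 3 stops before PanelReq$1.run, while a limit of 6 reaches it. Neither limit gets past run(), since nothing in the table calls it.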

Since the problem seems quite deep and tied to such a specific situation, we think a possible solution is to act directly on the Quark source code and handle these situations in a "custom" way (maybe by intercepting the run method and substituting its custom class with the generic Thread class). Unfortunately, we don't know how to implement this, but we hope this analysis is useful to you.

3aglew0, Jun 03 '21 10:06