
Feature: 'hex' keyword

Open Neo23x0 opened this issue 6 years ago • 10 comments

A new keyword hex that hex-encodes a string could simplify rule writing, given the rising number of malicious scripts embedded in OLE objects.

Instead of the string

$s1 = "68007400740070003a002f002f00"

a user could write

$s1 = "http://" wide hex

and other users could understand the rule without a comment.

/* http:// - wide and hex encoded */
$s1 = "68007400740070003a002f002f00"
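
For illustration, the mapping between the readable string and the hex-encoded literal can be reproduced with a few lines of Python. This is only a sketch: `wide_hex` is a hypothetical helper name, not part of YARA, and it assumes that "wide" means UTF-16LE, which for ASCII text is the same as interleaving a zero byte after each character.

```python
def wide_hex(text: str) -> str:
    """Return the hex-encoded text of the UTF-16LE ("wide") form of `text`."""
    # Step 1: "wide" encoding, two bytes per character for ASCII input.
    wide_bytes = text.encode("utf-16-le")
    # Step 2: hex-encode those bytes into a lowercase ASCII string.
    return wide_bytes.hex()


if __name__ == "__main__":
    print(wide_hex("http://"))
```

The output is the literal that currently has to be written by hand in the rule.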

Update 14.11.17: corrected the request - see the comment below

Neo23x0 avatar Sep 24 '17 08:09 Neo23x0

But that's exactly what the wide modifier does, right? I mean, these two strings are equivalent:

$s1 = "http://" wide

$s1 = { 68 00 74 00 74 00 70 00 3a 00 2f 00 2f 00 }

plusvic avatar Sep 24 '17 09:09 plusvic

Ah - no. Let me explain it differently. The objects are embedded in a hex-encoded form. (see the screenshot)

screen shot 2017-09-25 at 16 47 21

If you hex decode the strings, you'll get the VBA code in wide formatting. (not sure if this is always the case but I haven't seen ASCII code)

screen shot 2017-09-25 at 16 51 15

So, it would be easier for a user to write

$vba1 = "10.24.113.102/office.png" hex wide

Instead of

$vbacode1 = "00310030002E00320034002E00310033002E003100300032002F006F00660066006900630065002E0070006E0067" 

(note that this is a text string and not a byte sequence)

I propose this feature because

  1. I have recently seen a lot of malicious scripts and obfuscation for which I wrote rules with strings like $vbacode1, and the amount of malicious embedded content is rising
  2. I thought that it wouldn't be that difficult to implement

Possible problem:

  • The nocase keyword and its application

Neo23x0 avatar Sep 25 '17 15:09 Neo23x0

I see that I made an error in the initial request. I used the { instead of " symbols. My fault.

Neo23x0 avatar Sep 25 '17 15:09 Neo23x0

I realize this is an old ticket, but I'd like to mention that I too would find the hex modifier incredibly helpful. Like Florian's, my use case is related to OLE objects. Specifically, I was working with some RTFs recently where the attackers used a single-byte XOR key to obfuscate an embedded payload. This XOR key can of course differ from sample to sample. In an ideal world I would write something like the following:

$uniq_string = "GET /malware.png HTTP/1.1" hex xor

This would a) iterate through all XOR permutations of the string and b) convert each one to its hex-encoded value. Unfortunately, the only solution I'm aware of at this time is to use a programming language to generate all permutations of this string in its hex-encoded representation, and do a 1 of ($uniq_string*) on all of those permutations. This of course makes for an incredibly large YARA rule when there are multiple strings you're checking for. Such a rule often cannot be ingested by certain systems due to its size.
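
The workaround described above might look roughly like the following Python sketch. The helper names (`xor_hex_variants`, `to_yara_strings`) and the `$uniq_string_XX` naming scheme are assumptions made for illustration. The key range is a parameter, and the default covers every single-byte key, including 0x00, which leaves the text unchanged.

```python
def xor_hex_variants(text: str, keys=range(256)) -> dict:
    """Map each single-byte XOR key to the hex-encoded XORed form of `text`."""
    data = text.encode("ascii")
    return {key: bytes(b ^ key for b in data).hex() for key in keys}


def to_yara_strings(variants: dict, prefix: str = "uniq_string") -> list:
    """Render one YARA string definition per key (hypothetical naming scheme)."""
    return [
        f'${prefix}_{key:02x} = "{value}"'
        for key, value in sorted(variants.items())
    ]


if __name__ == "__main__":
    variants = xor_hex_variants("GET /malware.png HTTP/1.1")
    for line in to_yara_strings(variants):
        print(line)
    # A rule would then use a condition such as: 1 of ($uniq_string*)
```

With 256 keys per plaintext string, the generated rule grows quickly, which is the size problem mentioned above.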

jgrunzweig avatar Oct 16 '19 18:10 jgrunzweig

Hello!

I'd like to add my support for adding a hex modifier. Here are some possible details to consider:

  1. Hex should be applied after base64/xor modifier, but before ascii/wide modifier.
  2. Hex should be nocase by default, since most of the time code/things that ingest hex characters are case-insensitive.

Here's an example of two rules that might be equivalent with the new hex keyword:

rule example
{
strings:
    $a = "foobar" hex ascii wide
condition:
    any of them
}

rule equivalent_example
{
strings:
    $a = "666f6f626172" nocase ascii wide 
condition:
    any of them
}

Whilst it is possible to add these rules at present, they would be more readable if there were a "hex" keyword.
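
To make the proposed ordering concrete, here is a small Python sketch of what "hex before ascii/wide" would produce for the example string. The function name `hex_then_encode` is hypothetical, and treating "wide" as UTF-16LE is an assumption that holds for ASCII input. Case-insensitive matching of the hex digits is not modelled here.

```python
def hex_then_encode(text: str) -> dict:
    """Hex-encode `text` first, then derive the ascii and wide byte patterns."""
    # Step 1: hex modifier, e.g. "foobar" -> "666f6f626172".
    hex_text = text.encode("ascii").hex()
    # Step 2: ascii/wide are applied to the hex text, not to the original string.
    return {
        "ascii": hex_text.encode("ascii"),
        "wide": hex_text.encode("utf-16-le"),
    }


if __name__ == "__main__":
    for name, pattern in hex_then_encode("foobar").items():
        print(name, pattern)
```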

Cheers, Tom

tlansec avatar May 06 '20 10:05 tlansec

See #1249 for more context on adding additional modifiers.

It appears that new modifiers will not be added at this point, and that something like composable modifiers is needed before additional modifiers are added.

malvidin avatar May 06 '20 11:05 malvidin

I don't know if/when this new modifier is going to be considered, but I'd think about changing it to "hexstring" or something similar. Otherwise users could be confused, as a "hex" modifier may suggest it matches both "foobar" and { 66 6f 6f 62 61 72 } (the hex equivalent of its chars). 😉

merces avatar May 08 '20 18:05 merces

Signing up for this. My use case revolves around finding payloads in Msoffice documents, which are a common vector of sending malware through email.

As VBA (the language used in macro programming for that software) is case-insensitive, dealing with hex strings is a pain: the number of capitalization permutations is quite high, and some implementations for YARA rules are quite limited.

Something like:

$string = "a fragment of VBA code" nocase hex

Would be very much appreciated for dealing with these kinds of issues, although I don't know if that would be possible.

hluaces avatar Sep 03 '20 12:09 hluaces

This feature would be helpful.

strictlymike avatar Nov 10 '21 00:11 strictlymike

As discussed in https://gist.github.com/wxsBSD/44aa8b8133e3ea96e738b66ec1c600f2, the problem with adding more modifiers is defining how they interact with other modifiers. For example, if you say "foo" nocase hex, what does it mean exactly? Does it mean that it should match the uppercase (or lowercase) hex representation of "foo", "Foo", "FOO", and all the other case combinations of "foo"? Or does it mean that it should match the hex representation of "foo" no matter the casing of the hex digits?

Without implementing composable modifiers, adding new modifiers is getting harder as the number of interactions with other modifiers explodes.
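
The ambiguity can be made concrete with a short Python sketch that enumerates both readings of "foo" nocase hex. The function names are hypothetical, and the first reading is shown with lowercase hex digits only, which is itself an assumption.

```python
from itertools import product


def case_variants(text: str) -> set:
    """All upper/lower case combinations of `text` (non-letters stay fixed)."""
    options = [sorted({ch.lower(), ch.upper()}) for ch in text]
    return {"".join(combo) for combo in product(*options)}


def nocase_then_hex(text: str) -> set:
    """Reading 1: hex-encode every case combination of the plaintext."""
    return {variant.encode("ascii").hex() for variant in case_variants(text)}


def hex_then_nocase(text: str) -> set:
    """Reading 2: hex-encode the plaintext once, then ignore hex-digit case."""
    return case_variants(text.encode("ascii").hex())


if __name__ == "__main__":
    print(sorted(nocase_then_hex("foo")))
    print(sorted(hex_then_nocase("foo")))
```

For "foo", the two readings produce different sets of strings that overlap only in the plain lowercase encoding, so a rule author and the engine would need to agree on which one is meant.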

plusvic avatar Nov 10 '21 14:11 plusvic