Feature: 'hex' keyword
A new keyword hex that hex-encodes a string could improve the rule writing process as the number of malicious scripts embedded in OLE objects rises.
Instead of the string
$s1 = "68007400740070003a002f002f00"
a user could write
$s1 = "http://" wide hex
and other users could understand the rule without a comment.
/* http:// - wide and hex encoded */
$s1 = "68007400740070003a002f002f00"
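For illustration, the encoding that the proposed modifier would perform can be reproduced in a few lines of Python. This is only a sketch: the helper name wide_hex is made up here, and it assumes that "wide" means UTF-16LE (each character followed by a null byte) and that the hex digits are lowercase.

```python
def wide_hex(text: str) -> str:
    """Hex-encode the UTF-16LE ("wide") form of text.

    Hypothetical helper for illustration; not part of YARA.
    """
    return text.encode("utf-16-le").hex()


print(wide_hex("http://"))  # 68007400740070003a002f002f00
```

The output is the same string that currently has to be written by hand and explained in a comment.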
Update 14.11.17: corrected the request - see the comment below
But that's exactly what the wide modifier does, right? I mean, these two strings are equivalent:
$s1 = "http://" wide
$s1 = { 68 00 74 00 74 00 70 00 3a 00 2f 00 2f 00 }
Ah, no. Let me explain it differently. The objects are embedded in a hex-encoded form (see the screenshot).
If you hex-decode the strings, you get the VBA code in wide formatting (I'm not sure whether this is always the case, but I haven't seen ASCII code).
So, it would be easier for a user to write
$vba1 = "10.24.113.102/office.png" hex wide
Instead of
$vbacode1 = "00310030002E00320034002E00310033002E003100300032002F006F00660066006900630065002E0070006E0067"
(note that this is a string, not a byte sequence)
I propose this feature because
- I have recently seen a lot of malicious scripts and obfuscation for which I wrote rules with strings like $vbacode1, and the amount of malicious embedded content is rising
- I thought that it wouldn't be that difficult to implement
Possible problem:
- The nocase keyword and its application
I see that I made an error in the initial request. I used the { instead of the " symbols. My fault.
I realize this is an old ticket, but I'd like to mention that I too would find the hex modifier incredibly helpful. Like Florian, my use case is also related to OLE objects. Specifically, I was working with some RTFs recently where the attackers used a single-byte XOR key to obfuscate an embedded payload. This XOR key can of course differ from sample to sample. In an ideal world I would write something like the following:
$uniq_string = "GET /malware.png HTTP/1.1" hex xor
This would a) iterate through all XOR permutations of the string and b) convert each one to its hex-encoded value. Unfortunately, the only solution I'm aware of at this time is to use a programming language to generate all permutations of this string in their hex-encoded representation, and do a 1 of ($uniq_string*) on all of those permutations. This of course makes for an incredibly large YARA rule when there are multiple strings you're checking for. Such a rule often cannot be ingested by certain systems due to its size.
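The workaround described above could look roughly like the following Python sketch. The function names, the rule name, and the string identifiers are invented for this example. It also assumes that all 256 single-byte keys should be covered (key 0x00 being the unmodified string) and that nocase is wanted because the hex digits may appear in either case.

```python
def xor_hex_variants(text, keys=range(256)):
    """Return the hex encoding of text XORed with each single-byte key."""
    data = text.encode("ascii")
    return [bytes(b ^ key for b in data).hex() for key in keys]


def build_rule(name, text):
    """Build a YARA rule with one hex-encoded string per XOR key."""
    lines = [f"rule {name}", "{", "    strings:"]
    for key, variant in enumerate(xor_hex_variants(text)):
        # Assumption: nocase, since the hex digits may be upper- or lowercase.
        lines.append(f'        $uniq_string_{key:02x} = "{variant}" nocase')
    lines += ["    condition:", "        1 of ($uniq_string*)", "}"]
    return "\n".join(lines)


rule_text = build_rule("xor_hex_example", "GET /malware.png HTTP/1.1")
print(len(rule_text.splitlines()))  # 256 string lines plus 6 lines of rule scaffolding
```

Every additional plaintext string adds another 256 lines, which is how such rules grow too large for some systems.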
Hello!
I'd like to add my support for adding a hex modifier. Here are some possible details to consider:
- Hex should be applied after base64/xor modifier, but before ascii/wide modifier.
- Hex should be nocase by default, since most of the time code/things that ingest hex characters are case-insensitive.
Here's an example of two rules that might be equivalent with the new hex keyword:
rule example
{
    strings:
        $a = "foobar" hex ascii wide
    condition:
        any of them
}
rule equivalent_example
{
    strings:
        $a = "666f6f626172" nocase ascii wide
    condition:
        any of them
}
Whilst it is possible to add these rules at present, they would be more readable if there were a "hex" keyword.
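To make the proposed ordering concrete, here is a small Python sketch of the bytes a scanner would look for if hex were applied before ascii/wide. The function name is invented for this example, "wide" is assumed to mean UTF-16LE, and the case-insensitivity of the hex digits is left out for brevity.

```python
def hex_then_encode(text, wide=False):
    """Hex-encode text first, then apply the ascii or wide encoding.

    Hypothetical helper sketching the ordering proposed in this comment.
    """
    hex_text = text.encode("ascii").hex()  # "foobar" -> "666f6f626172"
    if wide:
        return hex_text.encode("utf-16-le")  # each hex digit followed by 0x00
    return hex_text.encode("ascii")


print(hex_then_encode("foobar"))             # b'666f6f626172'
print(hex_then_encode("foobar", wide=True))  # the same digits, interleaved with null bytes
```

Under this ordering the wide modifier widens the hex digits themselves. That appears to differ from the original request, where the wide form of the string is what gets hex-encoded.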
Cheers, Tom
See #1249 for more context on adding additional modifiers.
It appears that new modifiers will not be added at this point, and something like composable modifiers is needed before additional modifiers are added.
I don't know if/when this new modifier is going to be considered, but I'd think about changing it to "hexstring" or something similar. Otherwise users could be confused, as a "hex" modifier may suggest it matches both "foobar" and { 66 6f 6f 62 61 72 } (the hex equivalent of its chars). 😉
Signing up for this. My use case revolves around finding payloads in MS Office documents, which are a common vector for delivering malware through email.
As VBA (the language used for macro programming in that software) is case-insensitive, dealing with hex strings is a pain, as the number of capitalization permutations is quite high and some implementations for YARA rules are quite limited.
Something like:
$string = "a fragment of VBA code" nocase hex
Would be very much appreciated for dealing with these kinds of issues, although I don't know if that would be possible.
This feature would be helpful.
As discussed in https://gist.github.com/wxsBSD/44aa8b8133e3ea96e738b66ec1c600f2, the problem with adding more modifiers is defining how they interact with other modifiers. For example, if you say "foo" nocase hex, what does it mean exactly? Does it mean that it should match the uppercase (or lowercase) hex representation of "foo", "Foo", "FOO", and all the other case combinations of "foo"? Or does it mean that it should match the hex representation of "foo" no matter the casing of the hex digits?
Without implementing composable modifiers, adding new modifiers is getting harder as the number of interactions with other modifiers explodes.
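The two readings of "foo" nocase hex can be spelled out with a short Python sketch. The function names are invented for this example, and lowercase hex digits are assumed for the first reading.

```python
from itertools import product


def case_variants(text):
    """All upper/lower-case combinations of text, de-duplicated and sorted."""
    options = [sorted({ch.lower(), ch.upper()}) for ch in text]
    return sorted({"".join(combo) for combo in product(*options)})


def hex_of_case_variants(text):
    """Reading 1: the (lowercase) hex of every casing of the plaintext."""
    return sorted({v.encode("ascii").hex() for v in case_variants(text)})


def case_variants_of_hex(text):
    """Reading 2: every casing of the hex digits of the plaintext as written."""
    return case_variants(text.encode("ascii").hex())


print(hex_of_case_variants("foo"))  # 8 patterns, e.g. 666f6f, 466f6f, 464f4f
print(case_variants_of_hex("foo"))  # 4 patterns: 666f6f, 666F6f, 666f6F, 666F6F
```

The two sets share only "666f6f", so the two readings would match different data.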