Encoding for AnGae.dat

Open seamustuohy opened this issue 8 years ago • 4 comments

I have been fooling around with stripping out the AnGae.dat text strings and have been having difficulty getting them to show up properly. What encoding did you use for the final text? I have been using UTF-16 LE, but that has mostly resulted in gibberish strings.

seamustuohy, Aug 12 '16 13:08

Hi! Did you ever get anywhere with this?

I'm trying to do something similar: I'm attempting to convert AnGae.dat to YARA rules (basically just dumping each pattern as a hex string), but it looks like I'm doing something wrong. Attempting to convert to UTF-16 didn't lead to any Korean strings for me.

If you're still looking at this, or anyone else is, the code I used to try to create the YARA rules is below:


#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Dump every pattern record in AnGae.dat as a YARA rule with one hex string.
import binascii
import struct
import sys

# Open in binary mode: the file is packed binary data, not text.
with open(sys.argv[1], 'rb') as f:
    # Header fields, read as little-endian (layout as guessed by this script).
    timestamp, = struct.unpack('<I', f.read(4))

    unknown1 = f.read(1000)

    package_id, = struct.unpack('<I', f.read(4))
    unknown2, = struct.unpack('<I', f.read(4))
    pattern_date, = struct.unpack('<I', f.read(4))
    file_count, = struct.unpack('<I', f.read(4))
    head_pos, = struct.unpack('<I', f.read(4))
    real_size, = struct.unpack('<I', f.read(4))

    pattern_count, = struct.unpack('<Q', f.read(8))

    print('timestamp: {}'.format(timestamp))
    print('pattern count: {}'.format(pattern_count))

    # Each record: 4-byte length, 4-byte package id, 200-byte content field.
    patterns = []

    for _ in range(pattern_count):
        pattern = dict()
        pattern['reclen'], = struct.unpack('<I', f.read(4))
        pattern['package_id'], = struct.unpack('<I', f.read(4))
        pattern['content'], = struct.unpack('<200s', f.read(200))
        patterns.append(pattern)

    pattern_checksum, = struct.unpack('<20s', f.read(20))

# Emit one rule per pattern, numbered from 1.
for count, pattern in enumerate(patterns, start=1):
    # hexlify returns bytes on Python 3, so decode before concatenating.
    hex_string = binascii.hexlify(pattern['content']).decode('ascii')
    print('rule kr_yara' + str(count) + ' {')
    print(' strings: ')
    print('  $a_' + str(count) + ' = { ' + hex_string + ' }')
    print('condition:')
    print(' all of them')
    print('}')

chrisdoman, Jan 12 '18 23:01
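The per-record reads in the script above can also be written as a single struct format, which makes the assumed layout easier to check against the file. This is only a minimal sketch: it assumes the 4 + 4 + 200 byte record layout that the script uses (nothing in this thread confirms it), `RECORD` and `parse_record` are hypothetical names, and the sample bytes are synthetic rather than taken from AnGae.dat.

```python
import struct

# Record layout assumed by the script above: reclen, package_id, 200-byte content.
RECORD = struct.Struct('<II200s')


def parse_record(buf, offset=0):
    """Unpack one record from buf at the given offset and return it as a dict."""
    reclen, package_id, content = RECORD.unpack_from(buf, offset)
    return {'reclen': reclen, 'package_id': package_id, 'content': content}


# Synthetic record for illustration only; '200s' pads short input with NUL bytes.
sample = struct.pack('<II200s', 208, 7, b'\x41\x00\x42\x00')
record = parse_record(sample)
print(RECORD.size, record['reclen'], record['package_id'])
```

Because `RECORD.size` gives the byte length of one record under this layout, it can be used to work out which record a given file offset falls into.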

I have been fooling around with stripping out the AnGae.dat text strings and have been having difficulty getting them to show up properly. What encoding did you use for the final text? I have been using UTF-16 LE, but that has mostly resulted in gibberish strings.

It looks like they are little-endian, so switch the bytes and they appear to be UTF-16 strings.

SpiraMirabilis, Oct 21 '19 09:10
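To illustrate why byte order matters for UTF-16, here is a small sketch on a made-up Korean sample string (not data from AnGae.dat). It shows that swapping each pair of bytes converts between the little-endian and big-endian forms, and that decoding with the wrong order produces unrelated characters rather than an error. `swap_byte_pairs` is a hypothetical helper name.

```python
def swap_byte_pairs(data):
    """Swap the two bytes of every 16-bit unit (length must be even)."""
    if len(data) % 2:
        raise ValueError('expected an even number of bytes')
    return bytes(data[i ^ 1] for i in range(len(data)))


# Hypothetical sample text (two Hangul syllables), written with escapes.
text = '\uc870\uc120'
le_bytes = text.encode('utf-16-le')
be_bytes = text.encode('utf-16-be')

# Swapping byte pairs turns one byte order into the other.
swapped = swap_byte_pairs(le_bytes)

# Decoding with the wrong byte order still "works" but yields other characters.
wrong = be_bytes.decode('utf-16-le')
print(ascii(text), ascii(wrong))
```

If a decode of real data yields plausible Hangul with one byte order and unrelated symbols with the other, that is a reasonable hint about which order the file uses.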

@SpiraMirabilis do you have a script you can share for extracting the strings?

willscott, Nov 05 '19 18:11

Please share a script that works. I cannot get the script to work by passing encoding=utf_16_le.

I get this error:

UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 1080-1081: illegal encoding

Any help will be appreciated!

@SpiraMirabilis @takeshixx

Karmakstylez, Mar 11 '20 19:03
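Python raises a `UnicodeDecodeError` like the one above when strict decoding meets bytes that are not valid UTF-16 at that position. Whether that means the data is not plain UTF-16, or that decoding started at the wrong offset, is not settled in this thread. As a hedged sketch, one way to keep going and inspect the result is to decode one field at a time with `errors='replace'`. `decode_field` is a hypothetical helper, the sample bytes are synthetic, and the NUL padding is an assumption.

```python
def decode_field(raw, encoding='utf-16-le'):
    """Decode one fixed-width field, replacing undecodable units with U+FFFD."""
    text = raw.decode(encoding, errors='replace')
    # Drop trailing NUL padding, if there is any (padding is an assumption).
    return text.rstrip('\x00')


# Synthetic field: "AB", then a lone low surrogate (0xDC00), then NUL padding.
field = 'AB'.encode('utf-16-le') + b'\x00\xdc' + b'\x00' * 10

# Strict decoding fails on the lone surrogate.
try:
    field.decode('utf-16-le')
    strict_ok = True
except UnicodeDecodeError as err:
    strict_ok = False
    print('strict decode failed at bytes {}-{}'.format(err.start, err.end - 1))

# Tolerant decoding keeps the readable part and marks the bad unit.
print(ascii(decode_field(field)))
```

Replacement characters in the output mark where the bytes did not decode, which can help tell a wrong offset or wrong encoding apart from a few stray bytes.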