Create typing system

Open nyxxxie opened this issue 7 years ago • 5 comments

We need a way to define types to be used with the template system. This feature should be implemented in two parts:

  • Type: Provides to_string/from_string transforms and an indication of size in memory. Uniquely identified by a string (e.g. "int32_t").
  • Typesystem: Manages and stores types. Ensures there are no duplicates and allows for easy access via a for loop or type name lookup.

In addition, we should provide certain builtin types off the bat (byte, char, etc). These should be placed in their own directory in the project called types.
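
As a rough illustration of the two parts described above, here is a minimal sketch. The class names `Type` and `TypeSystem`, the methods `add_type` and `get_type`, and the hex rendering of `byte` are assumptions made for this example, not the interface that was eventually merged.

```python
from typing import Callable, Dict, Iterator


class Type:
    """A named type with string transforms and a fixed in-memory size."""

    def __init__(self, name: str, size: int,
                 to_string: Callable[[bytes], str],
                 from_string: Callable[[str], bytes]) -> None:
        self.name = name                # unique identifier, e.g. "int32_t"
        self.size = size                # size in memory, in bytes
        self.to_string = to_string      # raw bytes -> display string
        self.from_string = from_string  # display string -> raw bytes


class TypeSystem:
    """Stores types by name, rejects duplicates, supports iteration and lookup."""

    def __init__(self) -> None:
        self._types: Dict[str, Type] = {}

    def add_type(self, new_type: Type) -> None:
        if new_type.name in self._types:
            raise ValueError(f"type {new_type.name!r} is already registered")
        self._types[new_type.name] = new_type

    def get_type(self, name: str) -> Type:
        return self._types[name]  # raises KeyError for unknown names

    def __iter__(self) -> Iterator[Type]:
        return iter(self._types.values())


# Hypothetical builtin: a single byte rendered as two hex digits.
byte_type = Type(
    name="byte",
    size=1,
    to_string=lambda raw: raw[:1].hex(),
    from_string=lambda text: bytes.fromhex(text),
)

ts = TypeSystem()
ts.add_type(byte_type)

for t in ts:  # "easy access via a for loop"
    print(t.name, t.size)
```

Keying the store on the type name gives both the duplicate check and the name lookup from a single dictionary.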

nyxxxie avatar Jan 30 '17 21:01 nyxxxie

The template system should have at least these types defined: i8, u8, i16le, i16be, u16le, u16be, i32le, i32be, u32le, i64le, i64be, u64le, u64be, ieee754f32le, ieee754f32be, ieee754f64le, ieee754f64be, ieee754f128le, ieee754f128be and intelf80. Other types should alias to the types above depending on settings:

  • i8, int8_t to i8
  • i16, int16_t to i16le or i16be
  • i32, int32_t to i32le or i32be
  • i64, int64_t to i64le or i64be

u-prefixed types should alias respectively. C types must follow the C specification. That means char aliases to signed char or unsigned char depending on project settings. The permitted aliases are:

  • char may alias to signed char or unsigned char
  • signed char may alias to i8, i16, i32, i64
  • unsigned char may alias to u8, u16, u32, u64
  • short and int may alias to i16 or bigger
  • long may alias to i32 or bigger
  • long long may alias to i64 or bigger

The unsigned variants must alias to their unsigned counterparts respectively. Don't forget that short, short int, signed short and signed short int are the same type. The same holds for long and long long. Additionally, floating point number types must follow these rules:

  • float is usually an alias to ieee754f32le, but any floating point type may be set
  • double is usually an alias to ieee754f64le, but any floating point type may be set
  • long double may alias to ieee754f64le, ieee754f128le, intelf80, or another type

Additionally, a non-standard type wchar_t may alias to i16, u16 or larger.
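
To make the aliasing rules above concrete, here is a small sketch of an alias table built from project settings. The function names `build_aliases` and `resolve` and the settings `endian` and `char_signed` are hypothetical, and the table covers only a subset of the names listed above.

```python
from typing import Dict


def build_aliases(endian: str, char_signed: bool) -> Dict[str, str]:
    """Return an alias table for the given (hypothetical) project settings."""
    if endian not in ("le", "be"):
        raise ValueError("endian must be 'le' or 'be'")

    aliases = {"int8_t": "i8", "uint8_t": "u8"}

    # i16/int16_t -> i16le or i16be, and likewise for the u-prefixed types.
    for bits in (16, 32, 64):
        for prefix, c_name in (("i", "int"), ("u", "uint")):
            concrete = f"{prefix}{bits}{endian}"
            aliases[f"{prefix}{bits}"] = concrete
            aliases[f"{c_name}{bits}_t"] = concrete

    # char aliases to signed char or unsigned char depending on settings.
    aliases["signed char"] = "i8"
    aliases["unsigned char"] = "u8"
    aliases["char"] = "signed char" if char_signed else "unsigned char"

    # All spellings of short are the same type.
    for spelling in ("short", "short int", "signed short", "signed short int"):
        aliases[spelling] = "i16"
    return aliases


def resolve(name: str, aliases: Dict[str, str]) -> str:
    """Follow alias links until a name that is not itself an alias is reached."""
    seen = set()
    while name in aliases:
        if name in seen:
            raise ValueError(f"alias cycle at {name!r}")
        seen.add(name)
        name = aliases[name]
    return name


settings = build_aliases("le", char_signed=True)
print(resolve("signed short int", settings))
```

Letting an alias point at another alias (char to signed char to i8) means a single setting changes every dependent name at once.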

nanokatze avatar Jan 30 '17 21:01 nanokatze

Spade should probably include alias definitions for various platforms out of the box (windows-x86-msvc, windows-x86_64-msvc, windows-x86-gcc, windows-x86_64-gcc, linux-x86, linux-x86_64, m68k, etc., named as platform-arch-compiler) so that users can quickly select the one they want instead of configuring every alias manually (though manual configuration should be kept as an option).
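
One way such presets could look is sketched below. The `PRESETS` table and `load_aliases` function are hypothetical names; the two entries only reflect the usual LP64 (64-bit Linux) and LLP64 (64-bit Windows) data models and are far from a complete platform description.

```python
from typing import Dict, Optional

# Keys follow the platform-arch-compiler naming suggested above.
PRESETS: Dict[str, Dict[str, str]] = {
    # LP64: long is 64 bits; wchar_t is a signed 32-bit integer.
    "linux-x86_64": {
        "int": "i32le",
        "long": "i64le",
        "long long": "i64le",
        "wchar_t": "i32le",
    },
    # LLP64: long stays 32 bits; wchar_t is an unsigned 16-bit integer.
    "windows-x86_64-msvc": {
        "int": "i32le",
        "long": "i32le",
        "long long": "i64le",
        "wchar_t": "u16le",
    },
}


def load_aliases(preset: str,
                 overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Start from a preset, then apply any manual per-alias overrides."""
    aliases = dict(PRESETS[preset])  # copy so the preset itself is not mutated
    aliases.update(overrides or {})
    return aliases


print(load_aliases("windows-x86_64-msvc")["long"])
print(load_aliases("linux-x86_64", {"long": "i32le"})["long"])
```

The `overrides` argument keeps manual configuration available on top of a preset.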

nanokatze avatar Jan 30 '17 21:01 nanokatze

I think some of your proposed types can be added as a separate feature (for example, the le/be versions of types). Let's define the default set of types for successful completion of this feature to be byte and char, and add the additional builtins in a new feature. I do agree the typesystem should be able to support aliasing (creating a type that refers to another type).

nyxxxie avatar Jan 30 '17 21:01 nyxxxie

This comment is more of a note to myself. Since we want to add dynamic-sized types in #34, the design currently being developed may be inadequate for the task. An alternate design for this typesystem feature could be to have the typemanager store the Python class of a type itself (as opposed to a class object as we currently do), and then have the object serve as a representation of a series of bytes. Each type will take in the bytes through the constructor or through from_string() or from_bytes() functions (to allow reuse), and will then allow querying of the stored data using the to_string(), to_bytes(), and size() methods. Usage, then, would look like this:

int_bytes = b"\xde\xad\xbe\xef" # Some bytes we get from some random source

# We can get the Int32 type parser from a typesystem...
from spade.typesystem.typemanager import TypeManager
tm = TypeManager() # Initializes typesystem and adds default types
Int32 = tm.get_type("int32") # Returns usable class

# ...or we can just import it directly, since it's a default type
from spade.typesystem.types.int32 import Int32

# Usage 1 (The standard way to use it)
int_parse = Int32(int_bytes)
print(int_parse.to_string())
print(str(int_parse)) # Make str call to_string because why not

# Usage 2 (more verbose)
int_parse = Int32()
int_parse.from_bytes(int_bytes)
print(int_parse.to_string())

# Usage 3 (similar to typecasting)
print(Int32(int_bytes).to_string())

The above design also allows us to write various utility functions, like one that takes in a byte stream and extracts as many values of a given type as the stream contains. For instance, if we fed this function 9 bytes and asked it to parse out all Int32 values, it would return 2 Int32 objects representing both integers, leaving the trailing byte unused.
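
A rough sketch of such a utility follows. The `Int32` class here is a hypothetical stand-in written for this example, not the real spade class, and it assumes a signed little-endian layout since the byte order is not settled above. `parse_all` and the `SIZE` attribute are likewise assumed names.

```python
import struct
from typing import List, Optional


class Int32:
    """Hypothetical stand-in for the Int32 type described above."""

    SIZE = 4  # bytes

    def __init__(self, data: Optional[bytes] = None) -> None:
        self._data = b"\x00" * self.SIZE
        if data is not None:
            self.from_bytes(data)

    def from_bytes(self, data: bytes) -> None:
        if len(data) != self.SIZE:
            raise ValueError(f"expected {self.SIZE} bytes, got {len(data)}")
        self._data = bytes(data)

    def to_bytes(self) -> bytes:
        return self._data

    def size(self) -> int:
        return self.SIZE

    def to_string(self) -> str:
        # Assumption: signed, little-endian.
        return str(struct.unpack("<i", self._data)[0])

    def __str__(self) -> str:
        return self.to_string()


def parse_all(type_cls, stream: bytes) -> List:
    """Extract as many whole values of type_cls as the stream contains."""
    size = type_cls.SIZE
    whole = len(stream) - len(stream) % size  # ignore trailing partial value
    return [type_cls(stream[i:i + size]) for i in range(0, whole, size)]


# 9 bytes in, two Int32 objects out; the ninth byte is left over.
stream = b"\x01\x00\x00\x00" + b"\xff\xff\xff\xff" + b"\x99"
values = parse_all(Int32, stream)
print([str(v) for v in values])
```

Because the typemanager would hand back the class rather than an instance, a helper like this can construct one object per chunk without knowing anything about the type beyond its size.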

nyxxxie avatar Apr 04 '17 15:04 nyxxxie

I merged the final interface for implementing types as well as implementations for char, byte, int32le, and uint32le. I'll keep this issue open until I've implemented all (or most) types mentioned above.

nyxxxie avatar May 29 '17 10:05 nyxxxie