Issues representing structs with alignment in Golang's ABIInternal calling convention

Open mattiasgrenfeldt opened this issue 1 year ago • 22 comments

I'm not sure if this is a feature request, bug report or a discussion. If you think it should be a discussion, feel free to move it.

Is your feature request related to a problem? Please describe.

I'm working on a Go plugin and I'm having trouble representing the new (since Go 1.17) register-based calling convention ABIInternal (documented here: https://go.dev/s/regabi) on x86-64. Based on what has been said in other issues (https://github.com/NationalSecurityAgency/ghidra/issues/692), it seems that you are already working on better support for representing Go and that things currently can't be represented nicely, but I wanted to add my tricky case for you to consider.

I've run into the problem that, according to ABIInternal, structs are represented one way when sent in registers and another when on the stack. As soon as some struct needs alignment padding, I'm not able to represent it in Ghidra in a nice way. Let's look at two examples:

Example 1:

type Foo struct {
    A int16
    B int64
}

When passed in registers, Foo should be sent like this, according to the calling convention:

A - RAX
B - RBX

When passed on the stack, we instead get some alignment filler bytes between A and B since int64 requires alignment of 8 bytes:

[A][6 padding bytes][B]

We'll come back to the solutions I've attempted below, but let's look at the second example first.

Example 2:

type Bar struct {
    X int16
    Y int16
    Z int64
}

Here's how it's passed in registers:

X - RAX
Y - RBX
Z - RCX

Notice how, even though X and Y could be crammed into the same register, for example EAX, they are passed in separate registers.

Here's how it is passed on the stack:

[X][Y][4 bytes padding][Z]
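The in-memory layouts described above can be checked directly from Go. This is a minimal sketch, assuming a 64-bit target such as amd64 where int64 is 8-byte aligned; the helper names fooLayout and barLayout are made up for illustration and are not part of the attached alignment_issue.go.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Foo and Bar mirror the two example structs from this issue.
type Foo struct {
	A int16
	B int64
}

type Bar struct {
	X int16
	Y int16
	Z int64
}

// fooLayout returns the byte offsets of Foo's fields and its total size.
func fooLayout() (offA, offB, size uintptr) {
	var f Foo
	return unsafe.Offsetof(f.A), unsafe.Offsetof(f.B), unsafe.Sizeof(f)
}

// barLayout returns the byte offsets of Bar's fields and its total size.
func barLayout() (offX, offY, offZ, size uintptr) {
	var b Bar
	return unsafe.Offsetof(b.X), unsafe.Offsetof(b.Y), unsafe.Offsetof(b.Z), unsafe.Sizeof(b)
}

func main() {
	a, b, fs := fooLayout()
	fmt.Printf("Foo: A@%d B@%d size=%d\n", a, b, fs)
	x, y, z, bs := barLayout()
	fmt.Printf("Bar: X@%d Y@%d Z@%d size=%d\n", x, y, z, bs)
}
```

On amd64 this reports B at offset 8 in Foo and Z at offset 8 in Bar, with both structs 16 bytes long, which matches the 6 and 4 padding bytes shown in the stack layouts.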

Describe alternatives you've considered

I've figured out two different approaches to representing the above example structs. Both approaches fail in some way.

Approach 1: Use the struct packing feature and rely on Ghidra's alignment calculations. Using this, Foo would be represented as such:

struct Foo {
    short A;
    // 6 'undefined' filler bytes. Not visible due to packing, but still present in the struct.
    longlong B;
}

When this struct is passed on the stack, everything is good. Here is how I represent it in custom storage when sent in registers:

A + 6 filler bytes - RAX
B - RBX

Notice that storage must be assigned even for the filler bytes, even though they don't "matter". Now consider the following Go code (the full code is uploaded here: alignment_issue.go):

func main() {
	f := createFoo()
	printFoo(f)
	forceSpilling(5, 6)
	printFoo(f)
}

With the above custom storage assigned to the return value of createFoo() and argument to printFoo(), Ghidra gives the following:

  foo = main.createFoo();
  main.printFoo(foo);
  main.forceSpilling(5,6);
  main.printFoo((Foo)CONCAT88(SUB168((undefined  [16])foo,8),(ulong)SUB162((undefined  [16])foo,0)));

For the first call to printFoo(), foo is still stored in RAX and RBX from returning from createFoo(). But when forceSpilling() is called, foo must be spilled to the stack. So for the second call to printFoo(), foo must be loaded into registers from the stack. This is done by the following assembly (full assembly here: asm.txt):

MOVZX           EAX,word ptr [RSP + local_12]
MOV             RBX,qword ptr [RSP + local_10]
CALL            main.printFoo

This is probably what gives rise to the ugliness in the second call in the decompilation. The Go assembly doesn't care what the contents of the filler bytes are, so it fills them with zeroes using the MOVZX EAX,word ptr [RSP + local_12]. But Ghidra does care, and shows that the A field is cast to a ulong before being concatenated with field B.
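To make the effect of that MOVZX/MOV pair concrete, here is a small Go model of the reload. It is only a sketch: reloadFoo is a hypothetical helper, and the 16-byte array stands in for the spilled copy of foo on the stack. It computes the same thing the CONCAT88(SUB168(foo,8),(ulong)SUB162(foo,0)) expression describes: the low 2 bytes zero-extended to 8, and the high 8 bytes taken as they are.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// reloadFoo models reloading Foo from its 16-byte stack image into the
// two argument registers.
func reloadFoo(stack [16]byte) (rax, rbx uint64) {
	// MOVZX EAX, word ptr [...]: load the 2 bytes of A and zero-extend.
	// Writing EAX also clears the upper half of RAX.
	rax = uint64(binary.LittleEndian.Uint16(stack[0:2]))
	// MOV RBX, qword ptr [...]: load the 8 bytes of B.
	rbx = binary.LittleEndian.Uint64(stack[8:16])
	return rax, rbx
}

func main() {
	var stack [16]byte
	binary.LittleEndian.PutUint16(stack[0:2], 0x1234) // field A
	for i := 2; i < 8; i++ {
		stack[i] = 0xAA // arbitrary garbage in the padding bytes
	}
	binary.LittleEndian.PutUint64(stack[8:16], 0x1122334455667788) // field B

	rax, rbx := reloadFoo(stack)
	fmt.Printf("RAX=%#x RBX=%#x\n", rax, rbx)
}
```

Whatever the padding bytes contain, RAX ends up holding only the value of A. That zeroing is the extra semantics Ghidra has to express once the filler bytes are part of the storage.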

Okay, that was a bit ugly, but still functional. Let's now look at how Bar is represented as a struct in Ghidra using this approach:

struct Bar {
    short X;
    short Y;
    // 4 'undefined' filler bytes. Not visible due to packing, but still present in the struct.
    longlong Z;
}

For passing in registers, Bar can't be represented using custom storage:

X - AX
Y + 4 filler bytes - No version of RBX exists which is 6 bytes long.
Z - RCX

It is also not possible to skip or ignore the filler bytes in custom storage since the size of the registers must sum to the size of the struct.
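The constraint can be illustrated with a short Go sketch that groups each field with the padding that follows it and checks whether the result matches an x86-64 general-purpose register width (1, 2, 4 or 8 bytes). The chunk type and chunks function are hypothetical names, and the sketch assumes a flat struct of scalar fields on amd64.

```go
package main

import (
	"fmt"
	"reflect"
)

type Foo struct {
	A int16
	B int64
}

type Bar struct {
	X int16
	Y int16
	Z int64
}

// chunk is one field together with the padding that follows it.
type chunk struct {
	Field string
	Size  uintptr // field size plus trailing padding
	Fits  bool    // true if Size is a general-purpose register width
}

// chunks splits a flat struct value into per-field storage chunks whose
// sizes sum to the size of the whole struct.
func chunks(v interface{}) []chunk {
	t := reflect.TypeOf(v)
	var out []chunk
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		end := t.Size()
		if i+1 < t.NumField() {
			end = t.Field(i + 1).Offset
		}
		size := end - f.Offset
		fits := size == 1 || size == 2 || size == 4 || size == 8
		out = append(out, chunk{Field: f.Name, Size: size, Fits: fits})
	}
	return out
}

func main() {
	for _, c := range chunks(Bar{}) {
		fmt.Printf("%s: %d bytes, register-sized: %v\n", c.Field, c.Size, c.Fits)
	}
}
```

For Foo every chunk is 8 bytes, so Approach 1 works. For Bar the chunk for Y is 6 bytes, which is the piece that no sub-register of RBX can cover.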

Approach 2: Make the structs in Ghidra without any filler bytes. Example for Foo:

struct Foo {
    short A;
    longlong B;
} // No padding bytes anywhere!

This way, it is possible to represent the structs with custom storage when they are passed in registers, but as soon as the struct is put on the stack, the variable will be incorrect, since the alignment is missing. Sometimes the struct is passed on the stack instead of in registers. For that case I've tried to use custom storage and assign two stack ranges, skipping the alignment bytes (Stack[0x8]:2,Stack[0x10]:8), but only one stack range is allowed.

I would prefer for Approach 1 to work, but for the moment I am going with Approach 2 and accepting that structs which need alignment padding will be incorrect on the stack.

Describe the solution you'd like

I don't know of any good solutions for this. I just want a way to somehow represent these things more cleanly. To me, the most obvious way to fix Approach 1 would be to somehow allow "ignoring" the filler bytes when assigning custom storage, but then the semantics of those bytes getting zeroed out sometimes would disappear from the decompilation.

I don't like Approach 2, because the struct doesn't say that the alignment is happening, it feels incorrect. But the most obvious way to allow that approach would be to allow multiple disjoint stack ranges as variable storage.

Additional context

Here are some related issues I found:

  • https://github.com/NationalSecurityAgency/ghidra/issues/692 - Here it is mentioned that:
    • "We've got an analyzer working its way through the pipeline before its ready for release." - Do you have any planned release date for this?
    • "You won't be able to create a ghidra call spec that truly matches go's real behavior. Maybe for very simple functions it will match." - :cry:
  • https://github.com/NationalSecurityAgency/ghidra/issues/688 - Extra CONCATs and SUBs are often encountered when structs are passed in registers.
  • https://github.com/NationalSecurityAgency/ghidra/issues/4052 - Using <pentry> with a bunch of join pieces works okay in my .cspec when functions are simple. Here is an example for 32 byte arguments/return values:
<!-- 32 byte arguments -->
<pentry minsize="25" maxsize="32">
    <addr space="join" piece1="RDI" piece2="RCX" piece3="RBX" piece4="RAX"/>
</pentry>
  • However, it is very inconvenient that the pieceN attributes only go up to piece4. This means that as soon as something is larger than 32 bytes you have to resort to custom storage.
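Writing these join entries by hand gets repetitive, so they can be generated. The following Go sketch prints entries in the same shape as the 32-byte example above. It assumes the ABIInternal integer register order on amd64 and that piece1 is the most significant piece; joinEntry is a hypothetical helper, and whether Ghidra accepts more than four pieces depends on the version in use.

```go
package main

import (
	"fmt"
	"strings"
)

// intRegs is the ABIInternal integer argument register order on amd64.
var intRegs = []string{"RAX", "RBX", "RCX", "RDI", "RSI", "R8", "R9", "R10", "R11"}

// joinEntry renders one <pentry> joining n consecutive registers starting at
// intRegs[start]. The first register holds the lowest bytes, so it becomes
// the highest-numbered piece.
func joinEntry(start, n int) string {
	var pieces []string
	for j := 1; j <= n; j++ {
		pieces = append(pieces, fmt.Sprintf(`piece%d="%s"`, j, intRegs[start+n-j]))
	}
	return fmt.Sprintf("<pentry minsize=\"%d\" maxsize=\"%d\">\n    <addr space=\"join\" %s/>\n</pentry>",
		8*(n-1)+1, 8*n, strings.Join(pieces, " "))
}

func main() {
	// Entries for values occupying 2 to 9 registers, starting at RAX.
	for n := 2; n <= len(intRegs); n++ {
		fmt.Println(joinEntry(0, n))
	}
}
```

Calling joinEntry(0, 4) reproduces the 25 to 32 byte entry shown above.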

mattiasgrenfeldt avatar Mar 24 '23 10:03 mattiasgrenfeldt

Good write-up on the issues.

Some of these issues I've run into (some I glossed over, so thanks for highlighting them), and some have been addressed recently (i.e. the max number of pieces was increased, excessive CONCATs), but fundamental issues remain.

You are correct, there are incompatible memory layout requirements for stack passed vs. register passed structures, and no easy way to add some filler storage to occupy the padding locations. I'm not sure what the solution will be for this.

dev747368 avatar Mar 24 '23 18:03 dev747368

I wanted to experiment with how it would look if there did exist versions of the registers with sizes 3, 5, 6, and 7. So I attempted to copy the x86-64.sla file and add them, but it didn't go well. The new registers didn't show up in the Custom Storage dropdown afterwards. I probably did something wrong though.

In the copied .sla file, which I named x86-64.more-registers.sla, I added these two entries for each new register (example for RAX with size 3):

<varnode_sym name="RAX3" id="0x359" scope="0x0" space="register"  offset="0x0" size="3"></varnode_sym>
<varnode_sym_head name="RAX3" id="0x359" scope="0x0"/>

I generated the entries with this script: more-registers.py. The full .sla-file looks like this: x86-64.more-registers.sla.

I don't know the connection between the .sla-file and the .slaspec file, but I assumed that they need to be called the same, so I also copied x86-64.slaspec to x86-64.more-registers.slaspec in my /data/languages folder. I also copied all .sinc files, but didn't rename them. So the /data/languages directory in my extension looks like this:

adx.sinc          bmi1.sinc  golang1.19.5.cspec  mpx.sinc        smx.sinc
avx2_manual.sinc  bmi2.sinc  golang.ldefs        pclmulqdq.sinc  x86-64.more-registers.sla
avx2.sinc         cet.sinc   ia.sinc             rdrand.sinc     x86-64.more-registers.slaspec
avx_manual.sinc   clwb.sinc  lzcnt.sinc          sgx.sinc
avx.sinc          fma.sinc   macros.sinc         sha.sinc

golang.ldefs looks like this:

<?xml version="1.0" encoding="UTF-8"?>

<language_definitions>
   <language processor="x86"
            endian="little"
            size="64"
            variant="default"
            version="1.0"
            slafile="x86-64.more-registers.sla"
            processorspec="x86-64.pspec"
            manualindexfile="../manuals/x86.idx"
            id="x86:LE:64:golang-1.19.5">
    <description>Go Language Module</description>
    <compiler name="go1.19.5" spec="golang1.19.5.cspec" id="default"/>
  </language>
</language_definitions>

I can start Ghidra, load the file, disassemble and decompile it. There doesn't seem to be any issue. But when I look in the Custom Storage dropdown to select registers, I don't find my new ones.

mattiasgrenfeldt avatar Apr 04 '23 08:04 mattiasgrenfeldt

You are kinda going in the same direction I am; however, modifying a language is not going to be something we do. Previously I was playing with the idea of using parts of the XMM15 register (golang initializes it to value 0 and I defined it that way in my tests) as locations for padding, but ruled that out as not a 100% solution.

Right now I'm looking at using parts of the "unique" pseudo address space to occupy the padding locations in structures.

dev747368 avatar Apr 04 '23 21:04 dev747368

Right now I'm looking at using parts of the "unique" pseudo address space to occupy the padding locations in structures.

I tried this already and got this exception:

ghidra.util.exception.InvalidInputException: Hash, Unique and Constant storage may only use a single varnode

Trying a different address space, I got this:

ghidra.util.exception.InvalidInputException: Compound storage must use registers except for last varnode

monoidic avatar Apr 13 '23 17:04 monoidic

Yes, that is an issue with the code as it currently exists. For my tests, I disabled those checks and was able to get it to work, but it's not a pretty thing when the user sees a strange address space and random offsets as the location of their custom storage.

We are still investigating a better solution.

dev747368 avatar Apr 13 '23 17:04 dev747368

Good write-up on the issues.

Some of these issues I've run into (some I glossed over so thanks for highlighting them), and some of the issues are addressed recently (ie. max number of pieces was increased, excessive CONCATs), but fundamental issues remain.

You are correct, there are incompatible memory layout requirements for stack passed vs. register passed structures, and no easy way to add some filler storage to occupy the padding locations. I'm not sure what the solution will be for this.

The CONCATS have definitely gotten better (for register storage, haven't checked stack storage). However, it appears that the decompiler completely ignores the return value when returning a structure stored in multiple registers as seen in the go calling convention.

astrelsky avatar Apr 19 '23 17:04 astrelsky

However, it appears that the decompiler completely ignores the return value when returning a structure stored in multiple registers as seen in the go calling convention.

I'm not seeing that with my examples.

func test_return_struct() string {
        return "testing 123"
}

comes out as

string main.test_return_struct(void)

{
  string sVar1;
  
  sVar1.len = 0xb;
  sVar1.str = &DAT_00499a3e;
  return sVar1;
}

(where the return is in rax,rbx)

dev747368 avatar Apr 19 '23 21:04 dev747368

However, it appears that the decompiler completely ignores the return value when returning a structure stored in multiple registers as seen in the go calling convention.

I'm not seeing that with my examples.

func test_return_struct() string {
        return "testing 123"
}

comes out as

string main.test_return_struct(void)

{
  string sVar1;
  
  sVar1.len = 0xb;
  sVar1.str = &DAT_00499a3e;
  return sVar1;
}

(where the return is in rax,rbx)

func main.main() {

    println(main.test_return_struct())
}

Will decompile as

void main.main

{ // extra newline to force the garbage decompiler "c with classes" coding convention in the decompiler output for no reason


    // predeclared stack variable garbage from the 80s

    main.test_return_struct();
    str.data = garbageVariable1;
    str.length = garbageVariable2;
    println(str);

    return; // unnecessarily verbose return statement
}

The result of the call to test_return_struct is discarded.

It's probably easiest to see if you can get the compiler to use runtime.concatstring3.

astrelsky avatar Apr 19 '23 21:04 astrelsky

Dunno. I get:

void main.main(void)

{
  string s;
  
  while (&stack0x00000000 <= CURRENT_G.stackguard0) {
    runtime.morestack_noctxt();
  }
  s = main.test_return_struct();
  runtime.printlock();
  runtime.printstring(s);
  runtime.printnl();
  runtime.printunlock();
  return;
}

go was:

func main() {
        println(test_return_struct())
}

a more complicated test with string concats gets me:

void main.main2(void)

{
  string sVar1;
  string a1;
  uint8 local_30 [32];
  uint8 *local_10;
  
  while (&stack0x00000000 <= CURRENT_G.stackguard0) {
    runtime.morestack_noctxt();
  }
  sVar1 = main.test_return_struct();
  a1.len = 8;
  a1.str = (uint8 *)"appended";
  sVar1 = runtime.concatstring3((runtime.tmpBuf *)local_30,sVar1,a1,sVar1);
  local_10 = sVar1.str;
  runtime.printlock();
  sVar1.str = local_10;
  runtime.printstring(sVar1);
  runtime.printnl();
  runtime.printunlock();
  return;
}

from

func main2(){
        str := test_return_struct()
        str = str + "appended" + str
        println(str)
}

dev747368 avatar Apr 20 '23 13:04 dev747368

@dev747368 Never mind, Go is confusing. I didn't notice that the documentation says the register index gets reset to 0 before assigning the return registers, so I was assigning them wrong.

Now that I've sorted out my own stupidity (a little bit at least) it is much more workable.
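The reset is easy to miss, so here is a simplified Go sketch of it. It only models integer registers on amd64 and ignores floating-point registers, zero-sized values and the rules for which types are eligible for registers at all. The assign function is a hypothetical helper; each value is described only by how many integer registers it needs.

```go
package main

import "fmt"

// intRegs is the ABIInternal integer register order on amd64.
var intRegs = []string{"RAX", "RBX", "RCX", "RDI", "RSI", "R8", "R9", "R10", "R11"}

// assign hands out integer registers to a list of values, always starting
// at index 0. A value that does not fit entirely in the remaining registers
// is stack-assigned (nil here) and leaves the register index unchanged.
func assign(words []int) [][]string {
	out := make([][]string, len(words))
	next := 0
	for i, w := range words {
		if next+w > len(intRegs) {
			continue // goes to the stack
		}
		out[i] = intRegs[next : next+w]
		next += w
	}
	return out
}

func main() {
	// Arguments and results are assigned in two separate passes, so the
	// results start again at RAX instead of continuing after the arguments.
	args := assign([]int{2, 1}) // e.g. a string and an int
	rets := assign([]int{2})    // e.g. a string result
	fmt.Println("args:   ", args)
	fmt.Println("results:", rets)
}
```

With a string and an int as arguments and a string as the result, the arguments land in RAX, RBX and RCX, while the result is back in RAX and RBX.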

astrelsky avatar Apr 20 '23 13:04 astrelsky

@dev747368 Is your CURRENT_G a local variable that you have manually named and typed, or have you set up something so that there is always an assumed variable in every function with storage R14 and type runtime.g*? Because that would be nice to have, I'm just not sure how to do it.

mattiasgrenfeldt avatar Apr 21 '23 13:04 mattiasgrenfeldt

Is your CURRENT_G a local variable that you have manually named and typed, or have you set up something so that there is always an assumed variable in every function with storage R14 and type runtime.g*?

In my analyzer I'm globally setting R14 to the address of an artificial block of memory that I create that holds a runtime.g struct. I'm also setting XMM15 to 0

 Register currentGoroutineReg = ...;
 Address gAddr = ...;
 AddressSpace space = program.getAddressFactory().getDefaultAddressSpace();
 program.getProgramContext().setValue(currentGoroutineReg, space.getMinAddress(),
     space.getMaxAddress(), gAddr.getOffsetAsBigInteger());

(which is the gui equivalent of selecting all memory, right click, "Set Register Values", and setting R14 or whatever to the target)

dev747368 avatar Apr 21 '23 15:04 dev747368

Is your CURRENT_G a local variable that you have manually named and typed, or have you set up something so that there is always an assumed variable in every function with storage R14 and type runtime.g*?

In my analyzer I'm globally setting R14 to the address of an artificial block of memory that I create that holds a runtime.g struct. I'm also setting XMM15 to 0

 Register currentGoroutineReg = ...;
 Address gAddr = ...;
 AddressSpace space = program.getAddressFactory().getDefaultAddressSpace();
 program.getProgramContext().setValue(currentGoroutineReg, space.getMinAddress(),
     space.getMaxAddress(), gAddr.getOffsetAsBigInteger());

(which is the gui equivalent of selecting all memory, right click, "Set Register Values", and setting R14 or whatever to the target)

You really only need to do the .text memory block. This works really well but I can't decide on whether the decompilation looks better with the jump set as a call return or as a branch.

I found little to no actual documentation on what a closure is supposed to look like for RDX but I did just encounter it being used once. I think it just comes out to struct {void *fun; void *arg1, void *arg2, ... }; but I don't know what it looks like if there are more arguments.

Also, watch out for the duffcopy and duffzero functions, the calling convention is different. I basically set the inputs to rsi and rdi and all other registers except xmm0 as unaffected.

I've also been using a call fixup for the small functions that call runtime.gcWriteBarrier, where the fixup is effectively a nop.

For a good time find the function that initializes all the syscall.LazyProc pointers (syscall.init).

astrelsky avatar Apr 21 '23 18:04 astrelsky

@dev747368 Nice approach with using register assumptions and a labeled piece of memory! I have also used register assumptions to set XMM15 to 0, but I didn't think that you could use it for R14 as well. How are you making the artificial block of memory? What address are you giving it?


@astrelsky

This works really well but I can't decide on whether the decompilation looks better with the jump set as a call return or as a branch.

What are you referring to here?

I found little to no actual documentation on what a closure is supposed to look like [...]

From what I've understood, the closure context structure looks like this: struct {void *fun; T1 var1, T2 var2, ... }; where var1 and var2 are the closed-over variables. They are laid out consecutively after each other in the struct. The actual arguments to the function are sent as normal when it is called, in RAX, RBX, etc.

Also, watch out for the duffcopy and duffzero functions, [...]

I've made specific calling conventions for them in my .cspec. :+1:

I've also been using a call fixup for the small functions that call runtime.gcWriteBarrier, where the fixup is effectively a nop.

Good idea with using a call fixup! But won't you then still have the if-statement with an empty branch left? My plan was to patch away the entire if-statement and the gcWriteBarrierXX branch.

mattiasgrenfeldt avatar Apr 25 '23 07:04 mattiasgrenfeldt

@dev747368 Nice approach with using register assumptions and a labeled piece of memory! I have also used register assumptions to set XMM15 to 0, but I didn't think that you could use it for R14 as well. How are you making the artificial block of memory? What address are you giving it?


@astrelsky

This works really well but I can't decide on whether the decompilation looks better with the jump set as a call return or as a branch.

What are you referring to here?

The use of a pseudo memory block for runtime.g and setting the register value of R14 to its address. For the jump being a call return or branch, I'm referring to the jump back to the function entry point after acquiring more stack space.

I've also been using a call fixup for the small functions that call runtime.gcWriteBarrier, where the fixup is effectively a nop.

Good idea with using a call fixup! But won't you then still have the if-statement with an empty branch left? My plan was to patch away the entire if-statement and the gcWriteBarrierXX branch.

I don't see any empty if statements. They are most likely eliminated by the decompiler.

I'm of the opinion that patching should be avoided like covid unless you are actually looking to patch and export the program.

astrelsky avatar Apr 25 '23 10:04 astrelsky

I'm of the opinion that patching should be avoided like covid unless you are actually looking to patch and export the program.

Why? If it can be used to clean up and remove irrelevant things I don't see a problem with it. My guess is that you don't like it because it might hide things from the user.

mattiasgrenfeldt avatar Apr 25 '23 12:04 mattiasgrenfeldt

I'm of the opinion that patching should be avoided like covid unless you are actually looking to patch and export the program.

Why? If it can be used to clean up and remove irrelevant things I don't see a problem with it. My guess is that you don't like it because it might hide things from the user.

Nothing is irrelevant. It's not easily undoable, and there are built-in mechanisms specifically to deal with this sort of thing. It is more effective to tell the decompiler how to treat it than it is to patch it.

Most importantly, having to patch a program to get it to decompile, load or analyze properly because there's no alternative to patching is a crappy IDA thing.

Also I just discovered that you are correct, there is still a lingering if statement. Imo it's not that much of a bother though.

astrelsky avatar Apr 27 '23 00:04 astrelsky

Also, watch out for the duffcopy and duffzero functions, the calling convention is different. I basically set the inputs to rsi and rdi and all other registers except xmm0 as unaffected.

I've found that marking them as inline eliminates a lot of noise in the decompiler output, but it's been a while since I last played with it.

dev747368 avatar Apr 29 '23 01:04 dev747368

For the jump being a call return or branch I'm referring to the jump back to the function entry point after acquiring more stack space.

So, the JMP at the bottom of the function getting marked as call-return (and a self-recursive call showing up at the bottom of your function in the decompiler) is due to a bad interaction with the Shared Return Calls analyzer. I recommend turning it off, as well as the Apply Data Archives analyzer (there are some symbol names that match but you don't want what it's offering)

dev747368 avatar Apr 29 '23 02:04 dev747368

Also, watch out for the duffcopy and duffzero functions, the calling convention is different. I basically set the inputs to rsi and rdi and all other registers except xmm0 as unaffected.

I've found that marking them as inline eliminates a lot of noise in the decompiler output, but it's been a while since I last played with it.

I mean no disrespect but I highly doubt that for a few reasons.

  1. For an inline function A called in function B, if function A is called multiple times in function B then inlining will fail.
  2. sVar1._0_16 = *(undefined [16] *)0; is not clean. duffzero_size_offset(sVar1); is perfect.

duffzero only uses RDI and RSI (since XMM15 is 0) and duffcopy only uses RDI, RSI and XMM0 so modeling a calling convention for it is trivial and the output for it is perfect.

For gcWriteBarrier and gcWriteBarrierSI/gcWriteBarrierBX, etc., the call fixups work great. However, I can't figure out how to appropriately tell the decompiler that some registers are killed, or whether I even need to. As for the remaining if statement with the write barrier mechanism, I did the following. I located the data being checked in its memory block and split that block into 3 pieces so that the 4 bytes of data for the check are in their own block. I then initialized the block of 4 bytes to 0, created a dword and changed the mutability to constant. The decompiler will then remove the if statement, since it will think the value is always 0 and the if condition is always true (which is what we want, so the write always occurs).

astrelsky avatar Apr 29 '23 11:04 astrelsky

I just tried to do this to fix the input parameters for the ABIInternal calling convention and it complained:

2023-10-28 06:47:12 ERROR (ProgramDB) Compiler Spec golang for Language Intel/AMD 64-bit x86 Not Found, using default: ghidra.program.model.lang.CompilerSpecNotFoundException: Exception reading x86:LE:64:default/golang(x86-64-golang.cspec): <pentry> in the join space not allowed in <group> tag

<prototype name="abi-internal" extrapop="8" stackshift="8">
	<input>
		<pentry minsize="4" maxsize="8" metatype="float">
			<register name="XMM0_Qa"/>
		</pentry>
		<pentry minsize="4" maxsize="8" metatype="float">
			<register name="XMM1_Qa"/>
		</pentry>
		<pentry minsize="4" maxsize="8" metatype="float">
			<register name="XMM2_Qa"/>
		</pentry>
		<pentry minsize="4" maxsize="8" metatype="float">
			<register name="XMM3_Qa"/>
		</pentry>
		<pentry minsize="4" maxsize="8" metatype="float">
			<register name="XMM4_Qa"/>
		</pentry>
		<pentry minsize="4" maxsize="8" metatype="float">
			<register name="XMM5_Qa"/>
		</pentry>
		<pentry minsize="4" maxsize="8" metatype="float">
			<register name="XMM6_Qa"/>
		</pentry>
		<pentry minsize="4" maxsize="8" metatype="float">
			<register name="XMM7_Qa"/>
		</pentry>

		<group>
			<pentry minsize="1" maxsize="8">
				<register name="RAX"/>
			</pentry>
			<pentry minsize="9" maxsize="16">
				<addr space="join" piece2="RAX" piece1="RBX"/>
			</pentry>
			<pentry minsize="17" maxsize="24">
				<addr space="join" piece3="RAX" piece2="RBX" piece1="RCX"/>
			</pentry>
			<pentry minsize="25" maxsize="32">
				<addr space="join" piece4="RAX" piece3="RBX" piece2="RCX" piece1="RDI"/>
			</pentry>
			<pentry minsize="33" maxsize="40">
				<addr space="join" piece5="RAX" piece4="RBX" piece3="RCX" piece2="RDI" piece1="RSI"/>
			</pentry>
			<pentry minsize="41" maxsize="48">
				<addr space="join" piece6="RAX" piece5="RBX" piece4="RCX" piece3="RDI" piece2="RSI" piece1="R8"/>
			</pentry>
			<pentry minsize="49" maxsize="56">
				<addr space="join" piece7="RAX" piece6="RBX" piece5="RCX" piece4="RDI" piece3="RSI" piece2="R8" piece1="R9"/>
			</pentry>
			<pentry minsize="57" maxsize="64">
				<addr space="join" piece8="RAX" piece7="RBX" piece6="RCX" piece5="RDI" piece4="RSI" piece3="R8" piece2="R9" piece1="R10"/>
			</pentry>
			<pentry minsize="65" maxsize="72">
				<addr space="join" piece9="RAX" piece8="RBX" piece7="RCX" piece6="RDI" piece5="RSI" piece4="R8" piece3="R9" piece2="R10" piece1="R11"/>
			</pentry>
		</group>
		<group>
			<pentry minsize="1" maxsize="8">
				<register name="RBX"/>
			</pentry>
			<pentry minsize="9" maxsize="16">
				<addr space="join" piece2="RBX" piece1="RCX"/>
			</pentry>
			<pentry minsize="17" maxsize="24">
				<addr space="join" piece3="RBX" piece2="RCX" piece1="RDI"/>
			</pentry>
			<pentry minsize="25" maxsize="32">
				<addr space="join" piece4="RBX" piece3="RCX" piece2="RDI" piece1="RSI"/>
			</pentry>
			<pentry minsize="33" maxsize="40">
				<addr space="join" piece5="RBX" piece4="RCX" piece3="RDI" piece2="RSI" piece1="R8"/>
			</pentry>
			<pentry minsize="41" maxsize="48">
				<addr space="join" piece6="RBX" piece5="RCX" piece4="RDI" piece3="RSI" piece2="R8" piece1="R9"/>
			</pentry>
			<pentry minsize="49" maxsize="56">
				<addr space="join" piece7="RBX" piece6="RCX" piece5="RDI" piece4="RSI" piece3="R8" piece2="R9" piece1="R10"/>
			</pentry>
			<pentry minsize="57" maxsize="64">
				<addr space="join" piece8="RBX" piece7="RCX" piece6="RDI" piece5="RSI" piece4="R8" piece3="R9" piece2="R10" piece1="R11"/>
			</pentry>
		</group>
		<group>
			<pentry minsize="1" maxsize="8">
				<addr space="join" piece1="RCX"/>
			</pentry>
			<pentry minsize="9" maxsize="16">
				<addr space="join" piece2="RCX" piece1="RDI"/>
			</pentry>
			<pentry minsize="17" maxsize="24">
				<addr space="join" piece3="RCX" piece2="RDI" piece1="RSI"/>
			</pentry>
			<pentry minsize="25" maxsize="32">
				<addr space="join" piece4="RCX" piece3="RDI" piece2="RSI" piece1="R8"/>
			</pentry>
			<pentry minsize="33" maxsize="40">
				<addr space="join" piece5="RCX" piece4="RDI" piece3="RSI" piece2="R8" piece1="R9"/>
			</pentry>
			<pentry minsize="41" maxsize="48">
				<addr space="join" piece6="RCX" piece5="RDI" piece4="RSI" piece3="R8" piece2="R9" piece1="R10"/>
			</pentry>
			<pentry minsize="49" maxsize="56">
				<addr space="join" piece7="RCX" piece6="RDI" piece5="RSI" piece4="R8" piece3="R9" piece2="R10" piece1="R11"/>
			</pentry>
		</group>
		<group>
			<pentry minsize="1" maxsize="8">
				<addr space="join" piece1="RDI"/>
			</pentry>
			<pentry minsize="9" maxsize="16">
				<addr space="join" piece2="RDI" piece1="RSI"/>
			</pentry>
			<pentry minsize="17" maxsize="24">
				<addr space="join" piece3="RDI" piece2="RSI" piece1="R8"/>
			</pentry>
			<pentry minsize="25" maxsize="32">
				<addr space="join" piece4="RDI" piece3="RSI" piece2="R8" piece1="R9"/>
			</pentry>
			<pentry minsize="33" maxsize="40">
				<addr space="join" piece5="RDI" piece4="RSI" piece3="R8" piece2="R9" piece1="R10"/>
			</pentry>
			<pentry minsize="41" maxsize="48">
				<addr space="join" piece6="RDI" piece5="RSI" piece4="R8" piece3="R9" piece2="R10" piece1="R11"/>
			</pentry>
		</group>
		<group>
			<pentry minsize="1" maxsize="8">
				<addr space="join" piece1="RSI"/>
			</pentry>
			<pentry minsize="9" maxsize="16">
				<addr space="join" piece2="RSI" piece1="R8"/>
			</pentry>
			<pentry minsize="17" maxsize="24">
				<addr space="join" piece3="RSI" piece2="R8" piece1="R9"/>
			</pentry>
			<pentry minsize="25" maxsize="32">
				<addr space="join" piece4="RSI" piece3="R8" piece2="R9" piece1="R10"/>
			</pentry>
			<pentry minsize="33" maxsize="40">
				<addr space="join" piece5="RSI" piece4="R8" piece3="R9" piece2="R10" piece1="R11"/>
			</pentry>
		</group>
		<group>
			<pentry minsize="1" maxsize="8">
				<addr space="join" piece1="R8"/>
			</pentry>
			<pentry minsize="9" maxsize="16">
				<addr space="join" piece2="R8" piece1="R9"/>
			</pentry>
			<pentry minsize="17" maxsize="24">
				<addr space="join" piece3="R8" piece2="R9" piece1="R10"/>
			</pentry>
			<pentry minsize="25" maxsize="32">
				<addr space="join" piece4="R8" piece3="R9" piece2="R10" piece1="R11"/>
			</pentry>
		</group>
		<group>
			<pentry minsize="1" maxsize="8">
				<addr space="join" piece1="R9"/>
			</pentry>
			<pentry minsize="9" maxsize="16">
				<addr space="join" piece2="R9" piece1="R10"/>
			</pentry>
			<pentry minsize="17" maxsize="24">
				<addr space="join" piece3="R9" piece2="R10" piece1="R11"/>
			</pentry>
		</group>
		<group>
			<pentry minsize="1" maxsize="8">
				<addr space="join" piece1="R10"/>
			</pentry>
			<pentry minsize="9" maxsize="16">
				<addr space="join" piece2="R10" piece1="R11"/>
			</pentry>
		</group>
		<pentry minsize="1" maxsize="8">
			<register name="R11"/>
		</pentry>

		<pentry minsize="1" maxsize="500" align="8">
			<addr offset="8" space="stack"/>
		</pentry>
	</input>
...

astrelsky avatar Oct 28 '23 10:10 astrelsky

Never mind, I forgot that you have to lay it out without the groups, and then it should work.

astrelsky avatar Oct 31 '23 22:10 astrelsky