Research: Findings from the Microsoft Blogs

Open Eeveelution opened this issue 2 years ago • 0 comments

GR0, like PR0, is hardwired: GR0 always reads as 0 (PR0 always reads as 1), and writing to GR0 triggers a processor exception.
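A minimal sketch of how the hardwired GR0 could be modelled in an emulator. The class and exception names are made up for illustration; the only behaviour taken from the notes above is "reads as 0, writes raise an exception". The 64-bit wrap-around is an added assumption about register width.

```python
class IllegalOperationFault(Exception):
    """Hypothetical stand-in for the exception raised on a write to GR0."""


class GeneralRegisters:
    """128 general registers; GR0 always reads as 0 and rejects writes."""

    def __init__(self):
        self._values = [0] * 128

    def read(self, index):
        # GR0 is hardwired, so its backing slot is never consulted.
        return 0 if index == 0 else self._values[index]

    def write(self, index, value):
        if index == 0:
            raise IllegalOperationFault("GR0 is hardwired to 0")
        # Assumption: registers hold 64-bit values, so wrap accordingly.
        self._values[index] = value & 0xFFFFFFFFFFFFFFFF
```

Reading GR0 is always fine, which is why it is handy as a constant zero operand; only the write path needs the check.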

GR1 is called the Global Pointer. It points to the current function's global variables, which is needed because Itanium has no absolute addressing mode.

In the Win32 calling convention for Itanium, GR8...GR11 are used for return values and GR12 is the stack pointer (unknown whether that holds for Itanium generally or for Win32 only).

The NotAThing (NaT) bit is used for speculative execution to indicate that the value of a register isn't valid yet. Using such a register in, for example, an arithmetic operation spreads the NaT bit to the destination register as well, and a lot of instructions disallow NaT'ed registers, meaning that an uninitialized variable access could lead to a program crash.
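The propagation rule can be sketched as follows. This is a simplified model, not the full architecture: `Reg`, `add`, `store` and `NaTConsumptionFault` are hypothetical names, and a plain store is used as the example of an instruction that rejects NaT'ed operands.

```python
from dataclasses import dataclass

MASK64 = 0xFFFFFFFFFFFFFFFF


class NaTConsumptionFault(Exception):
    """Hypothetical stand-in for the fault raised on a rejected NaT operand."""


@dataclass(frozen=True)
class Reg:
    value: int
    nat: bool = False  # True means the value is not valid (NaT)


def add(a: Reg, b: Reg) -> Reg:
    # Arithmetic does not fault on NaT; it taints the result instead.
    return Reg((a.value + b.value) & MASK64, a.nat or b.nat)


def store(memory: dict, address: Reg, data: Reg) -> None:
    # Modelled as an instruction that disallows NaT'ed registers.
    if address.nat or data.nat:
        raise NaTConsumptionFault("store with a NaT operand")
    memory[address.value] = data.value
```

In this model the crash happens at the point where the tainted value is finally consumed by a strict instruction, not where it was produced, which matches the description of the bit spreading through arithmetic first.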

FR0 is hardwired to 0.0 and FR1 is hardwired to 1.0.

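As with the GRs, FR0 through FR31 are static and FR32 through FR127 are rotating (for the GRs, 32 to 127 are the stacked registers, of which a function can make a portion rotate). Under the Win32 calling convention, however, FR0...FR5 and FR16...FR31 are preserved across calls; the others are scratch.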

PR0...PR15 are static; PR16...PR63 are rotating.

In the Win32 calling convention, PR0...PR5 are preserved while PR6...PR63 are scratch.

In the Win32 calling convention, BR0 holds the return address; it is set automatically when br.call is executed.

In the Win32 calling convention, BR1...BR5 are preserved while BR6 and BR7 are scratch.

BSP is an Application Register (AR) which is called ia64's second stack pointer. This second stack grows upwards, as opposed to the normal stack, which grows downwards, and it is used to store register states from long ago. I speculate that this is where the RSE (Register Stack Engine) saves registers when an allocation requires more registers than are available.


A stop indicates that the instructions after the stop may rely on data produced by the instructions before the stop, which means that instructions not separated by a stop can be executed in parallel.

A sequence of instructions without a single stop is called an instruction group.

  • Exceptions to the "no dependencies in an instruction group" rule: branch instructions are allowed to depend on PRs and/or BRs set up earlier in the same group
  • The result of a successful ld.c may be used without a stop
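  • "Comparison instructions .and, .andcm, .or, and .orcm are allowed to combine with others of the same type into the same targets. (In other words, you can combine two .ands, but not an .and and an .or.)" As far as I can tell, each of these compare types can only ever force its target predicates to one fixed value, so several of the same type writing the same predicate in one group cannot conflict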
  • Writing to a register that was read earlier in the group is allowed
  • Two instructions in the same group are not allowed to write to the same register
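The basic rules above can be sketched as a small checker. This is a simplified model under stated assumptions: the `Insn` record and `find_violations` are hypothetical, registers are plain strings, and the special cases (branches, ld.c, the combinable compare types) are deliberately ignored. `stop_after` marks a stop following the instruction, which Itanium assembly writes as `;;`.

```python
from typing import NamedTuple


class Insn(NamedTuple):
    text: str
    reads: frozenset
    writes: frozenset
    stop_after: bool = False  # True if a stop (;;) follows this instruction


def find_violations(insns):
    """Return (index, kind, register) for each dependency inside a group."""
    violations = []
    written = set()  # registers written so far in the current group
    for i, insn in enumerate(insns):
        for reg in sorted(insn.reads & written):
            violations.append((i, "read-after-write", reg))
        for reg in sorted(insn.writes & written):
            violations.append((i, "write-after-write", reg))
        # Write-after-read is allowed, so earlier reads are not tracked.
        written |= insn.writes
        if insn.stop_after:
            written.clear()  # a stop ends the instruction group
    return violations


program = [
    Insn("add r1 = r2, r3", frozenset({"r2", "r3"}), frozenset({"r1"})),
    Insn("add r4 = r1, r5", frozenset({"r1", "r5"}), frozenset({"r4"}), True),
    Insn("add r6 = r4, r1", frozenset({"r4", "r1"}), frozenset({"r6"})),
]
print(find_violations(program))
```

Here the second instruction reads r1 in the same group that wrote it, so it is reported; the third instruction sits after the stop and is fine.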

CONCEPTUAL SO FAR

The stacked registers begin at GR32, and this is where function parameters go on entry to a function. Assuming the function takes 2 parameters, GR32 is parameter 1 and GR33 is parameter 2. Immediately afterwards come the private local registers: assuming the function requires 4 registers for private use, GR34, GR35, GR36 and GR37 would be the local registers. After those come the output registers. Let's assume the function wants to call a function which takes 3 parameters: it would put those into GR38, GR39 and GR40. A function therefore has to account for the functions it calls, so that it allocates enough output registers to hold the parameters of the call that needs the most.

The input and local registers are collectively called the local region; the input, local and output registers together are known as the register frame.

Any registers higher than the last output register are off limits to the function: they do not exist, and trying to access them is disallowed.
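The layout described above can be worked through with a short sketch. `frame_layout` and `region_of` are hypothetical helper names; the numbers reproduce the example of 2 inputs, 4 locals and 3 outputs.

```python
FIRST_STACKED = 32  # the stacked registers begin at GR32


def frame_layout(inputs, locals_, outputs):
    """Map each region of the register frame to its GR numbers."""
    in_end = FIRST_STACKED + inputs
    local_end = in_end + locals_
    out_end = local_end + outputs
    if out_end > 128:
        raise ValueError("frame does not fit in GR32...GR127")
    return {
        "input": range(FIRST_STACKED, in_end),
        "local": range(in_end, local_end),
        "output": range(local_end, out_end),
    }


def region_of(layout, reg):
    """Name the region a GR belongs to, or 'off limits' beyond the frame."""
    if reg < FIRST_STACKED:
        return "static"
    for name, regs in layout.items():
        if reg in regs:
            return name
    return "off limits"


layout = frame_layout(inputs=2, locals_=4, outputs=3)
for name, regs in layout.items():
    print(name, [f"GR{r}" for r in regs])
```

With these counts, GR38 to GR40 are the output registers and `region_of(layout, 41)` reports the register as off limits.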

The alloc instruction takes, in order: the register in which to store the previous register frame state, the number of input registers, the number of local registers, the number of output registers, and lastly the number of rotating registers to allocate for the function.
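A rough sketch of what an emulator might compute from the four counts. The field names sof, sol and sor, the 96-register limit, and the rule that the rotating count is a multiple of 8 come from my reading of the architecture manual rather than from the blog posts, so treat them as assumptions to verify; the `alloc` function and `Frame` type themselves are hypothetical.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    sof: int  # size of frame: inputs + locals + outputs
    sol: int  # size of the local region: inputs + locals
    sor: int  # size of the rotating region, in units of 8 registers


def alloc(inputs, locals_, outputs, rotating):
    """Validate the counts given to alloc and derive the frame fields."""
    if min(inputs, locals_, outputs, rotating) < 0:
        raise ValueError("register counts cannot be negative")
    sof = inputs + locals_ + outputs
    if sof > 96:
        raise ValueError("only GR32...GR127 (96 registers) are stacked")
    if rotating % 8 != 0 or rotating > sof:
        raise ValueError("rotating count must be a multiple of 8 within the frame")
    return Frame(sof=sof, sol=inputs + locals_, sor=rotating // 8)


print(alloc(inputs=2, locals_=4, outputs=3, rotating=0))
```

The destination register for the previous frame state is left out here, since this sketch only covers the size bookkeeping.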

Afterwards, the return address is immediately saved with mov r<x> = b0.

Stopped here: image in part 3

END OF CONCEPTUAL

Sources

  • The Itanium processor, part 1: Warming up
  • The Itanium processor, part 2: Instruction encoding, templates, and stops
  • The Itanium processor, part 3: The Windows calling convention, how parameters are passed
  • The Itanium processor, part 3b: How does spilling actually work?
  • The Itanium processor, part 4: The Windows calling convention, leaf functions
  • The Itanium processor, part 5: The GP register, calling functions, and function pointers
  • The Itanium processor, part 6: Calculating conditionals
  • The Itanium processor, part 7: Speculative loads
  • The Itanium processor, part 8: Advanced loads
  • The Itanium processor, part 9: Counted loops and loop pipelining
  • The Itanium processor, part 10: Register rotation

Eeveelution, Nov 08 '22 19:11