Rosalia64
Research: Findings from the Microsoft Blogs
GR0, like PR0, is hardwired: GR0 always reads as 0 (PR0 always reads as 1), and writing to GR0 triggers a processor exception.
GR1 is called the Global Pointer and points to the current function's global variables, because Itanium has no absolute addressing mode.
In the Win32 calling convention for Itanium, GR8...GR11 are used for return values.
GR12 is the stack pointer (unclear whether this holds for Itanium in general or only for Win32).
The NotAThing (NaT) bit is used for speculative execution to indicate that the value of a register isn't valid yet. Using such a register in, for example, an arithmetic operation spreads the NaT bit to the destination register as well, and a lot of instructions disallow NaT'ed registers, meaning that accessing an uninitialized variable could lead to a program crash.
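The propagation rule can be illustrated with a small Python model. This is only a sketch under stated assumptions: the names `Reg`, `add`, `store` and `NaTConsumptionFault` are invented for this illustration, and `store` stands in for "an instruction that disallows NaT'ed registers".

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


class NaTConsumptionFault(Exception):
    """Raised when an instruction that disallows NaT'ed operands sees one."""


@dataclass(frozen=True)
class Reg:
    """Hypothetical model of one general register: a 64-bit value plus its NaT bit."""
    value: int = 0
    nat: bool = False  # True means "the value is not valid yet"


def add(a: Reg, b: Reg) -> Reg:
    # Arithmetic does not fault; it spreads the NaT bit to the result.
    if a.nat or b.nat:
        return Reg(0, nat=True)  # the value of a NaT'ed register is meaningless
    return Reg((a.value + b.value) & MASK64)


def store(memory: dict, address: Reg, data: Reg) -> None:
    # Modeled as an instruction that refuses NaT'ed operands.
    if address.nat or data.nat:
        raise NaTConsumptionFault("store with a NaT'ed operand")
    memory[address.value] = data.value
```

In this model, `add(Reg(5), Reg(nat=True))` yields a register whose `nat` flag is set, and passing that register to `store` raises `NaTConsumptionFault`, which mirrors how an uninitialized value can travel through arithmetic and only crash later.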
FR0 is hardwired to 0.0.
FR1 is hardwired to 1.0.
As with the GRs, the FRs split at register 32: FR0 through FR31 are static, FR32 through FR127 are rotating. In the Win32 calling convention, however, FR2...FR5 and FR16...FR31 are preserved across calls (FR0 and FR1 are hardwired); the others are scratch.
PR0...PR15 are static, PR16...PR63 are rotating.
In the Win32 calling convention PR1...PR5 and the rotating PR16...PR63 are preserved, while PR6...PR15 are scratch (PR0 is hardwired to 1).
BR0 in the Win32 calling convention is the return address; it is set automatically when br.call is executed.
In the Win32 calling convention BR1...BR5 are preserved, while BR6 and BR7 are scratch.
BSP is an Application Register (AR) which is called ia64's second stack pointer. It grows upwards, as opposed to the normal stack, which grows downwards, and it's used to store register states from long ago. I speculate that this is where the RSE (Register Stack Engine) saves registers in case of an allocation requiring more registers than are available.
Stops are used as an indication that the instructions after the stop may rely on data produced by the instructions before the stop, which means the instructions between two stops can be executed in parallel.
A sequence of instructions without a single stop is called an instruction group. Instructions in the same group must not depend on each other, with the following exceptions and rules:
- Branch instructions are an exception to the 'no dependencies in an instruction group' rule: they are allowed to depend on PRs and/or BRs set up earlier in the group
- The result of a successful ld.c is allowed to be used without a stop
- Whatever this means: "Comparison instructions .and, .andcm, .or, and .orcm are allowed to combine with others of the same type into the same targets. (In other words, you can combine two .ands, but not an .and and an .or.)"
- Writing to a register that was read earlier in the group is allowed
- Two instructions in the same group are not allowed to write to the same register
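The basic rules in the list above can be sketched as a small checker. This is a rough illustration, not a complete model: `Insn` and `check_group` are hypothetical names, the read and write sets are filled in by hand, and the exceptions for branches, ld.c and the parallel comparison types are deliberately not modeled.

```python
from typing import FrozenSet, Iterable, List, NamedTuple


class Insn(NamedTuple):
    """Hypothetical description of one instruction: its text plus the registers it touches."""
    text: str
    reads: FrozenSet[str]
    writes: FrozenSet[str]


def check_group(group: Iterable[Insn]) -> List[str]:
    """Return the rule violations found inside one instruction group (no stops inside)."""
    problems = []
    written = set()  # registers written so far in this group
    for insn in group:
        # Reading a register written earlier in the group needs a stop in between.
        for reg in sorted(insn.reads & written):
            problems.append(f"{insn.text}: reads {reg} written earlier in the group")
        # Two writes to the same register in one group are not allowed.
        for reg in sorted(insn.writes & written):
            problems.append(f"{insn.text}: writes {reg} a second time in the group")
        # Writing a register that was only read earlier is fine, so reads are not tracked.
        written |= insn.writes
    return problems


group = [
    Insn("add r34 = r32, r33", frozenset({"r32", "r33"}), frozenset({"r34"})),
    Insn("add r32 = r35, r36", frozenset({"r35", "r36"}), frozenset({"r32"})),  # write after read: allowed
    Insn("add r37 = r34, r35", frozenset({"r34", "r35"}), frozenset({"r37"})),  # reads r34: needs a stop first
]
print(check_group(group))
```

Only the third instruction is reported, because it reads r34, which the first instruction wrote within the same group.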
CONCEPTUAL SO FAR
The stacked registers begin at GR32, and this is where function parameters go. On entry to a function that takes 2 parameters, GR32 is parameter 1 and GR33 is parameter 2. Immediately afterwards come the private local registers: assuming the function requires 4 registers for private use, GR34, GR35, GR36 and GR37 would be local registers. After those come the output registers: assuming the function wants to call a function which takes 3 parameters, it would put those into registers GR38, GR39 and GR40. A function therefore has to account for the functions it calls, so that it allocates enough output registers to hold the parameters of the call that needs the most.
The input and local registers are collectively called the local region; the input, local and output registers together are known as the register frame.
Any registers higher than the last output register are off limits to the function: they do not exist, and trying to access them is disallowed.
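The numbering described above is simple arithmetic, which the following Python sketch spells out. The function names `frame_layout` and `is_accessible` are made up for this illustration; the limit of 96 stacked registers follows from GR32 through GR127.

```python
FIRST_STACKED = 32  # stacked registers begin at GR32
MAX_FRAME = 96      # GR32..GR127


def frame_layout(inputs: int, locals_: int, outputs: int) -> dict:
    """Map the three region sizes to the GR numbers each region occupies."""
    total = inputs + locals_ + outputs
    if min(inputs, locals_, outputs) < 0 or total > MAX_FRAME:
        raise ValueError("a register frame holds between 0 and 96 registers")
    first_local = FIRST_STACKED + inputs
    first_output = first_local + locals_
    end = first_output + outputs  # first register that is off limits
    return {
        "inputs": list(range(FIRST_STACKED, first_local)),
        "locals": list(range(first_local, first_output)),
        "outputs": list(range(first_output, end)),
    }


def is_accessible(reg: int, layout: dict) -> bool:
    """Static registers are always there; stacked ones only inside the frame."""
    if 0 <= reg < FIRST_STACKED:
        return True
    frame = layout["inputs"] + layout["locals"] + layout["outputs"]
    return reg in frame


layout = frame_layout(inputs=2, locals_=4, outputs=3)
print(layout)
```

For the example from the text (2 inputs, 4 locals, 3 outputs) this gives GR32 and GR33 as inputs, GR34 through GR37 as locals and GR38 through GR40 as outputs, and `is_accessible(41, layout)` is false.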
The alloc instruction takes, in order: the register in which to store the previous register frame state, the number of input registers, the number of local registers, the number of output registers, and lastly the number of rotating registers to allocate for the function.
Afterwards the return address is immediately saved into a local register, as such: mov r<x> = b0
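To make the operand order concrete, here is a minimal Python sketch that picks apart one alloc line. It assumes the usual assembler spelling `alloc rN = ar.pfs, i, l, o, r`, which is not quoted in the notes above; `parse_alloc` is a hypothetical helper, and the two sanity checks (frame of at most 96 registers, rotating count a multiple of 8 that fits in the frame) are architectural rules that the notes do not cover.

```python
import re

# Assumed syntax: alloc rN = ar.pfs, inputs, locals, outputs, rotating
ALLOC_RE = re.compile(
    r"alloc\s+r(\d+)\s*=\s*ar\.pfs\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*$"
)


def parse_alloc(line: str) -> dict:
    """Split an alloc instruction into its five operands."""
    match = ALLOC_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an alloc instruction: {line!r}")
    save_reg, inputs, locals_, outputs, rotating = map(int, match.groups())
    frame_size = inputs + locals_ + outputs
    if frame_size > 96:
        raise ValueError("register frame larger than 96 registers")
    if rotating % 8 != 0 or rotating > frame_size:
        raise ValueError("rotating count must be a multiple of 8 within the frame")
    return {
        "save_reg": save_reg,  # receives the previous register frame state
        "inputs": inputs,
        "locals": locals_,
        "outputs": outputs,
        "rotating": rotating,
    }


print(parse_alloc("alloc r34 = ar.pfs, 2, 4, 3, 0"))
```

The example line requests the frame used in the walkthrough above: 2 inputs, 4 locals, 3 outputs and no rotating registers, with the previous frame state saved in the first local register.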
stopped here:
on 3
END OF CONCEPTUAL
Sources
- The Itanium processor, part 1: Warming up
- The Itanium processor, part 2: Instruction encoding, templates, and stops
- The Itanium processor, part 3: The Windows calling convention, how parameters are passed
- The Itanium processor, part 3b: How does spilling actually work?
- The Itanium processor, part 4: The Windows calling convention, leaf functions
- The Itanium processor, part 5: The GP register, calling functions, and function pointers
- The Itanium processor, part 6: Calculating conditionals
- The Itanium processor, part 7: Speculative loads
- The Itanium processor, part 8: Advanced loads
- The Itanium processor, part 9: Counted loops and loop pipelining
- The Itanium processor, part 10: Register rotation