Fix alignment in chapter 1 example code

Open: aturley opened this issue 3 years ago • 1 comment

Change .align 2 to .align 4.

The README.md tells users to change .align 2 to .align 4, but the change had not been made in this file.

aturley, Jan 14 '22 16:01

This confused me too. Stephen Smith's own blog says:

  • In MacOS the program must start on a 64-bit boundary, hence the listing has an “.align 2” directive near top.

My understanding from the GNU as docs (LLVM's assembler may differ) is that .p2align works as follows:

.p2align 0 => 1 byte boundary
.p2align 1 => 2 byte boundary
.p2align 2 => 4 byte boundary
.p2align 3 => 8 byte boundary
.p2align 4 => 16 byte boundary
...

Since a byte is 8 bits, a 64-bit boundary is equivalent to a 64/8 = 8 byte boundary, which, from the table above, requires .p2align 3 rather than .p2align 2.
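
For anyone who wants to check the arithmetic, here is a minimal sketch of the table and the bits-to-exponent conversion. It assumes the power-of-two reading of .p2align described above; the function names p2align_boundary and p2align_for_bits are made up for illustration and are not part of any assembler or toolchain.

```python
def p2align_boundary(n: int) -> int:
    """Byte boundary requested by `.p2align n`, assuming it means 2**n bytes."""
    return 1 << n


def p2align_for_bits(bits: int) -> int:
    """Exponent to pass to .p2align for a boundary given in bits (hypothetical helper)."""
    if bits % 8 != 0:
        raise ValueError("boundary must be a whole number of bytes")
    nbytes = bits // 8
    # A .p2align boundary is always a power of two.
    if nbytes == 0 or nbytes & (nbytes - 1):
        raise ValueError("boundary must be a power-of-two number of bytes")
    return nbytes.bit_length() - 1


if __name__ == "__main__":
    # Reproduce the table from the comment above.
    for n in range(5):
        print(f".p2align {n} => {p2align_boundary(n)} byte boundary")
    # Exponents for a 64-bit and a 32-bit boundary.
    print("64-bit boundary needs .p2align", p2align_for_bits(64))
    print("32-bit boundary needs .p2align", p2align_for_bits(32))
```

Under that assumption, a 64-bit boundary maps to exponent 3 and a 32-bit boundary to exponent 2, which is the mismatch being pointed out here.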

I don't know if the mistake is in Stephen's blog (perhaps it should say "32 bit boundary" rather than "64-bit boundary") or perhaps the code here should have .p2align 3 rather than .p2align 2. Some clarity would be useful though. I do see that the README.md says either .align 2 or .p2align 4 can be used, so I don't think .align 4 is intended (i.e. the change in this PR doesn't seem quite right).

One further point of personal confusion: I thought that, at least in GNU as for aarch64, .align was equivalent to .p2align - but perhaps I'm wrong. I do know that on some platforms its argument is the boundary in bytes (i.e. an 8 byte boundary would be .align 8 on those architectures) - but I thought this wasn't the case for aarch64. Certainly the semantics of .align vary between architectures under GNU as, which is probably why .balign and .p2align were created, presumably to standardise semantics across architectures.
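
To make the two possible readings of .align concrete, here is a small sketch comparing them. It does not say which reading any particular assembler or target uses; that is exactly the open question, and the answer should come from the assembler's documentation. The names align_boundary and padding are hypothetical helpers for illustration only.

```python
def align_boundary(n: int, power_of_two: bool) -> int:
    """Boundary in bytes for `.align n` under one of the two readings.

    power_of_two=True  -> n is an exponent (same as .p2align n)
    power_of_two=False -> n is a byte count (same as .balign n)
    """
    return 1 << n if power_of_two else n


def padding(address: int, boundary: int) -> int:
    """Bytes of padding needed to move `address` up to the next multiple of `boundary`."""
    return -address % boundary


if __name__ == "__main__":
    for n in (2, 3, 4):
        as_exponent = align_boundary(n, power_of_two=True)
        as_bytes = align_boundary(n, power_of_two=False)
        print(f".align {n}: {as_exponent} bytes if an exponent, {as_bytes} bytes if a byte count")
    # Padding inserted before code that would otherwise start at offset 6.
    print("padding to an 8 byte boundary from offset 6:", padding(6, 8))
```

Read as an exponent, .align 4 asks for a 16 byte boundary; read as a byte count, it asks for a 4 byte boundary. The same directive can therefore mean quite different things depending on which reading applies.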

Whatever the reason for it all, perhaps a word or two could be spared in the README to explain the difference between .align and .p2align, and to clarify whether it really needs to be a 64-bit boundary or whether a 32-bit boundary is sufficient (Stephen could then be told if his blog post needs correcting too).

Thanks Alexander and Andrew!

petemoore, Jul 15 '22 08:07