re-order struct fields for better performance and memory usage

Open andrewrk opened this issue 9 years ago • 15 comments

In Zig, you have no guarantee about the memory layout of a struct unless you make a struct packed or export.

Take advantage of this:

  • [ ] align fields appropriately for fast access. Allow explicitly setting the minimum alignment per field.
  • [ ] re-order smaller fields to fit in alignment gaps for smaller memory usage
  • [ ] gaps created by maybe types, error unions, and other containers can be filled by smaller types
  • [ ] put large fixed size arrays at the end for better cache performance of the rest of the struct fields
  • [ ] reorder fields so that the same function code can work for multiple specialized data types. For example, a field that uses a parameterized type might be moved to the last position, so that functions which only reference fields before that one can share the same codegen.

Provide safety against incorrect assumptions about field order:

  • [x] provide builtin function @container_base(inline T: type, field_name_symbol, field_pointer) -> &T (better name suggestion welcome @thejoshwolfe). This function takes a pointer to a container field, the container type, and a symbol which is the name of the field, and returns a pointer to the struct. Programmers should use this builtin instead of futzing around with pointers to go from a field pointer to the container base pointer.
  • [ ] in debug mode, order fields backwards from declaration order. This ensures that an incorrect assumption will not work in debug mode. This safety trick may turn out to be too expensive. If so, we will do the same field ordering logic in debug mode and release mode.
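
As a rough illustration of the size savings at stake, the sketch below compares a default struct with an extern struct that must keep declaration order. It assumes current Zig syntax (which postdates this issue); the type names Auto and Fixed are made up for the example, and the size of Auto is whatever the compiler picks, so it is printed rather than asserted.

```zig
const std = @import("std");

// Default layout: the compiler may reorder fields (hypothetical example type).
const Auto = struct { a: u8, b: u32, c: u8 };

// C ABI layout: declaration order is kept, so padding follows `a` and `c`.
const Fixed = extern struct { a: u8, b: u32, c: u8 };

pub fn main() void {
    // Fixed is 12 bytes wherever u32 is 4-byte aligned; Auto is typically smaller.
    std.debug.print("auto={d} fixed={d}\n", .{ @sizeOf(Auto), @sizeOf(Fixed) });
}
```

Code that never depends on the offsets of Auto keeps working whichever order the compiler chooses.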

andrewrk avatar Aug 03 '16 17:08 andrewrk

Don't you think it would be better to let the programmer decide the packing order of the fields? If the compiler decides how to pack stuff, working programs may break from one compiler version to another. I don't think the complexity and risk this optimization adds is worth it. If you still want to add it, you can provide a special annotation to indicate that the compiler should decide how to pack the struct.

felipecrv avatar Sep 03 '16 19:09 felipecrv

The programmer has a choice: leave the field order undefined and give the power to the compiler, or use a packed struct and specify the order themselves. Packed structs are not done yet. See #183

andrewrk avatar Sep 05 '16 05:09 andrewrk

So if I need a guarantee on the order of a structure it has to be packed, right?

If so you'd probably need a chapter of documentation describing how to properly align fields in structs to avoid (or at least mitigate) people complaining about segfaults :)

kiljacken avatar Sep 05 '16 06:09 kiljacken

> The programmer has a choice: leave the field order undefined and give the power to the compiler, or use a packed struct and specify the order themselves. Packed structs are not done yet. See #183

Yeah. But don't make undefined the default behavior. You're still going to have padding of course, but the order of fields will be guaranteed.

felipecrv avatar Sep 05 '16 10:09 felipecrv

> So if I need a guarantee on the order of a structure it has to be packed, right?

Correct.

There is also the concept of export or extern structs which have the C rules about layout and these are guaranteed to be compatible with other C code.

> If so you'd probably need a chapter of documentation describing how to properly align fields in structs to avoid (or at least mitigate) people complaining about segfaults :)

Can you explain a situation that would result in a segfault because of code relying on field order?

andrewrk avatar Sep 05 '16 16:09 andrewrk

> don't make undefined the default behavior. You're still going to have padding of course, but the order of fields will be guaranteed.

Can you explain the use case where you want to rely on field order?

andrewrk avatar Sep 05 '16 16:09 andrewrk

x86 will segfault upon doing an unaligned memory access (typically aligned to data size). Imagine a structure like this:

struct Gizmo {
    tag: u8,
    thingy: u16
}

Assuming allocated memory is aligned, accessing the thingy field through a pointer to a Gizmo will result in a segfault.

There are a few details with regard to the load/store instructions used: SSE loads/stores segfault, whereas (iirc) the normal mov "just" suffers performance-wise because it is split into several aligned loads/stores.
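
For reference, a small sketch of how the compiler lays out this struct when declaration order is preserved. It uses current Zig syntax and an extern struct (both are assumptions relative to the 2016 snippet above) and prints where thingy ends up, so you can see the padding inserted after tag.

```zig
const std = @import("std");

// Same fields as the Gizmo above, with C ABI layout so the order is fixed.
const Gizmo = extern struct {
    tag: u8,
    thingy: u16,
};

pub fn main() void {
    // Padding after `tag` places `thingy` at a multiple of its alignment.
    std.debug.print("offset={d} align={d} size={d}\n", .{
        @offsetOf(Gizmo, "thingy"),
        @alignOf(u16),
        @sizeOf(Gizmo),
    });
}
```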

kiljacken avatar Sep 05 '16 21:09 kiljacken

>> don't make undefined the default behavior. You're still going to have padding of course, but the order of fields will be guaranteed.

> Can you explain the use case where you want to rely on field order?

Most of the use cases can be considered hacks, but since we're talking about a low-level language, these hacks are legitimate use cases. If the compiler defines the order of the fields, it becomes hard to analyze memory dumps (there may be many situations where raw memory is the only thing you have to debug something). Appending a field to a struct can also reorder the fields that existed before, which gives up the ability to write code that deals with data generated from the previous struct specification.

Maybe I'm being too pessimistic here, but my impression is that this complexity would backfire in unpredictable ways if you have it by default.

felipecrv avatar Sep 08 '16 22:09 felipecrv

> Maybe I'm being too pessimistic here, but my impression is that this complexity would backfire in unpredictable ways if you have it by default.

If that happens we can always change the decision. Code that does not assume field order will continue to work correctly even if field order later becomes guaranteed.

Also there will be a builtin function for getting the address of a field based on its name. Example: @getFieldPointer(struct_instance, "field_name"). Note that "field_name" will have to be a compile time constant.

andrewrk avatar Sep 11 '16 18:09 andrewrk

Is this something where the field enumeration in #383 would help?

Note for the last part, is this not the same as getting the address of the field in the struct via &?

kyle-github avatar Sep 16 '17 01:09 kyle-github

> Note for the last part, is this not the same as getting the address of the field in the struct via &?

See http://ziglang.org/documentation/#builtin-fieldParentPtr

andrewrk avatar Sep 16 '17 17:09 andrewrk

This issue will now have a meaningful impact on async functions since it applies to all local variables.

andrewrk avatar Aug 15 '19 22:08 andrewrk

I mentioned this on Discord, but it would be useful if you could tag certain fields to be omitted from reordering and let Zig do whatever it wants with the rest. At the top of the struct I'm storing small tags which I take the address of; the tags contain runtime information about the type of the struct the tag is part of and the offset to the struct's base. I want to keep this tag small, so u8 for offsets would have been fine if not for struct reordering. I ended up storing the field index instead and using inline for to find the offset of the field, which is also fine but adds at least the small cost of a switch.

Extern struct is not applicable here because I do want the Zig struct semantics for the rest of the fields, and I also don't want the restriction that an extern struct may only contain extern structs.
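
A minimal sketch of the index-to-offset workaround described above, assuming Zig 0.11 or later (for the indexed inline for). The Node type and the offsetOfIndex helper are hypothetical names for illustration, not the actual code in question.

```zig
const std = @import("std");

// Hypothetical struct; its in-memory field order is up to the compiler.
const Node = struct {
    tag: u8,
    payload: u64,
    flag: bool,
};

// Map a declaration-order field index to that field's byte offset.
// The inline for unrolls into one comparison per field.
fn offsetOfIndex(comptime T: type, index: usize) usize {
    inline for (std.meta.fields(T), 0..) |field, i| {
        if (i == index) return @offsetOf(T, field.name);
    }
    unreachable;
}

pub fn main() void {
    std.debug.print("payload offset: {d}\n", .{offsetOfIndex(Node, 1)});
}
```

The stored index stays valid whatever layout the compiler picks, at the price of the runtime comparison chain.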

Cloudef avatar Jun 07 '24 06:06 Cloudef

One reason for having the option of known ordering of a struct without much overhead is being able to use guard integrity fields of known bytes surrounding the struct that you can see in the memory dump. This is especially important when debugging embedded systems with very little resources. This helps detect faults that overwrite portions of memory.

It is also very useful to read the structs out of the memory of a running system, but unknown ordering makes that impossible.

If there was an easy option to turn the reordering off without having to define all the alignment with a packed struct, that would be helpful. It might not even need to be per struct but a compile time option.

MikeColeGuru avatar Jul 08 '24 04:07 MikeColeGuru

@fieldParentPtr would also be a no-op if I could specify the struct field to be the first field in memory.

But automatic ordering is also nice in a few cases where I don't care about it that much.

kocsis1david avatar Aug 21 '24 19:08 kocsis1david