Cataclysm-DDA icon indicating copy to clipboard operation
Cataclysm-DDA copied to clipboard

Add `resolved_id<T>` to represent IDs that are already resolved.

Open prharvey opened this issue 2 years ago • 4 comments

Summary

Infrastructure "Add resolved_id<T> to represent IDs that are already resolved."

Purpose of change

See issue #70399

Describe the solution

Introduce resolved_id which is a thin wrapper around a pointer, and acts as a companion to int_id and string_id. It avoids the branching and indirection in resolving an id. The branch is almost always true and is easily predicted, so the main win is removing a layer of indirection and generating less code (no debug message!) in an extremely common function.

As this requires the objects in generic_factory to not move in memory, the internal vector was swapped for deque. This is why 100+ files are touched. The downside is that this adds a layer of indirection to int_id/string_id look ups, but performance sensitive code (and global hardcoded ids) will be migrated to use resolved_id.

This also adds two uses of the class: resolved_ter_id and resolved_furn_id. Replaced some int_id usage with these new types. As a side effect, also fixed a source of potential bugs: there were a few static maps that saved int_ids, but those are not stable between loads. They now store string_ids and are stable.

Finally, added a clang-tidy check to warn about global resolved_ids, as they are not stable between loads.

Describe alternatives you've considered

Refactor int_id to ensure it is always valid. There are way too many cases of constructing it from an int for this to be feasible, or even desirable.

Testing

Added tests for resolved_id, and the new clang-tidy check.

Additional context

These were taken with #70274 and #70546 merged, as they add a lot of noise. Two samples before and after, since the profiles are now much more stable.

Before: Screenshot 2024-01-01 124026 Screenshot 2024-01-01 124324

After: Screenshot 2024-01-01 130308 Screenshot 2024-01-01 130522

Fun fact: with MSVC the resulting binary is 30kB smaller. That isn't a whole lot, but since it almost entirely consists of replacing setting up registers + a call instruction for int_id::obj() with a single pointer dereference, it's kind of crazy how often this function is called with just terrain and furniture.

prharvey avatar Dec 26 '23 07:12 prharvey

We probably don't want to use deque here. On MSVC deque is roughly equivalent to list in that it allocates every object separately, which is rather wasteful. I'd suggest colony instead (that's a container we have in this repo; not a standard library one).

jbytheway avatar Jan 09 '24 05:01 jbytheway

We probably don't want to use deque here. On MSVC deque is roughly equivalent to list in that it allocates every object separately, which is rather wasteful. I'd suggest colony instead (that's a container we have in this repo; not a standard library one).

I'm aware of the issue with MSVC deque, although at the time of writing it was my understanding that MSVC was for developers only and not the version we cared about. After discussing it on discord, apparently the plan is to move to MSVC for the windows build (which I think is a mistake), but it is also why I re-drafted this PR.

I think even in the degenerate case where it's an element per block it's probably "fine" since that's basically a vector of unique_ptr, which is the alternative (well, the alternative that doesn't require us to implement a new container). I'm not super familiar with colony, but from a quick look it doesn't appear to support O(1) random access, so it's unsuitable as it's a requirement for int_id lookups (this is not meant to be a replacement for them).

I was thinking of taking this in a different direction and just leaving it as a vector, and make resolved ID essentially just an int ID that was already verified to be valid, but I'm busy at the moment.

prharvey avatar Jan 09 '24 07:01 prharvey

After discussing it on discord, apparently the plan is to move to MSVC for the windows build (which I think is a mistake)

I used to push for that move, but now I'm more ambivalent. I mostly wanted it for better debug stack traces on Windows, but I think that might have been resolved another way since?

Still, I think we should avoid doing anything that would be bad on MSVC.

I think even in the degenerate case where it's an element per block it's probably "fine" since that's basically a vector of unique_ptr, which is the alternative

It would only hurt for cases where we want to iterate over all the elements, which happens, but I guess it's not too common.

it doesn't appear to support O(1) random access, so it's unsuitable

Yeah, I think you're right; colony is probably not a great fit either. It has log(n) random access, which is probably not good enough.

I was thinking of taking this in a different direction and just leaving it as a vector, and make resolved ID essentially just an int ID that was already verified to be valid

That sounds reasonable.

jbytheway avatar Jan 10 '24 00:01 jbytheway

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Please do not bump or comment on this issue unless you are actively working on it. Stale issues, and stale issues that are closed are still considered.

github-actions[bot] avatar Feb 20 '24 21:02 github-actions[bot]