
SysV Shared memory module

Open · kris-gaudel opened this issue 1 month ago • 1 comment

Part of https://github.com/orgs/Shopify/projects/13332?pane=issue&itemId=139784715&issue=Shopify%7Cresiliency%7C6646

This implements the first two parts of the issue (SysV shared memory and atomic ops). PID state updates will be in a follow-up PR.

Instructions to build the extension and run the unit test (after initializing the container):

  1. `podman-compose -f .devcontainer/docker-compose.yml exec semian bash` to get into the container
  2. `bundle exec rake build` to compile
  3. `bundle exec ruby -Ilib:test test/shared_memory_test.rb` to run the unit test for the extension

kris-gaudel · Nov 19 '25 03:11

Summary

Here is a summary of the conversation:

  • Kris proposed a detailed architecture for implementing shared memory and PID update logic. The plan involves the first process creating a shared memory segment and semaphore, with subsequent processes attaching to them.
  • To manage PID updates, Kris suggested that each process run a background thread. This thread would attempt to acquire a mutex at the end of each time window to update the PID controller's values, such as the rejection_rate, in shared memory.
  • To prevent redundant updates from multiple processes, Kris's design includes a timestamp. A process will only perform an update if a certain amount of time has passed since the last update.
  • Anagayan expressed concern about each process creating a background thread, suggesting an alternative where updates are triggered on a small fraction of incoming requests instead. Abdulrahman noted this approach could be blocked by slow requests, making the sliding window less effective.
  • Anagayan also raised the issue of cleaning up shared memory when processes crash unexpectedly, suggesting there might be a trade-off where cleanup only happens on a full pod restart. Abdulrahman requested that Kris's research be documented in GitHub for future reference.

This summary was generated automatically using Google's gemini-2.5-pro (temperature 0.5).

:thread: Slack Thread
Kris Gaudel
I spent a lot of time thinking today about how the shared memory and PID update logic will be implemented (notes and drawings here). The description is in the :thread:
Kris Gaudel
*How the semaphore is initialized in the bulkhead*
1. Process 1 calls semget to try to create the semaphore; since the semaphore doesn't exist yet, the call doesn't fail with EEXIST and creates it (`sem_id = semget(...)`).
2. Process 2 calls semget, but it fails with EEXIST, so it sets its sem_id to that of the existing set (`sem_id = wait_for_new_semaphore_set(res->key, permissions);`).
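A minimal sketch of that create-or-attach handshake, with the helper name assumed and error handling trimmed (the real code also waits for the creator to finish initializing the set before using it):

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical helper: the first process creates the semaphore set,
   every later process falls back to attaching to the existing one. */
static int
create_or_attach_semaphore(key_t key, int permissions)
{
  /* Only one caller can win this race thanks to IPC_EXCL. */
  int sem_id = semget(key, 1, IPC_CREAT | IPC_EXCL | permissions);
  if (sem_id != -1)
    return sem_id;     /* we created it; the caller initializes it (SETVAL) */

  if (errno != EEXIST)
    return -1;         /* some other failure */

  /* The set already exists: look it up without IPC_EXCL and reuse it. */
  return semget(key, 1, permissions);
}
```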
Kris Gaudel
*We can follow the same logic but for our shared memory*
1. P1 tries to create the shared memory; it doesn't already exist, so `shm_id = shmget(...)` creates it.
2. P2 tries to create the shared memory; it already exists (EEXIST), so it looks up the existing segment (`shm_id = shmget(...)` without IPC_EXCL) and attaches to it with `shmat(...)`.
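Under the same assumptions, a sketch of the shared memory side; the `shared_state_t` layout here is a placeholder for whatever the PID controller ends up storing:

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical layout for the segment; the real struct will hold the
   PID controller state discussed below. */
typedef struct {
  double rejection_rate;
  long   update_timestamp_ms;
  int    ref_count;
} shared_state_t;

static shared_state_t *
create_or_attach_shared_state(key_t key, int permissions, int *created)
{
  /* P1 creates the segment exclusively... */
  int shm_id = shmget(key, sizeof(shared_state_t),
                      IPC_CREAT | IPC_EXCL | permissions);
  if (shm_id != -1) {
    *created = 1;
  } else if (errno == EEXIST) {
    /* ...P2..Pn find it already exists and just look it up. */
    *created = 0;
    shm_id = shmget(key, sizeof(shared_state_t), permissions);
  }
  if (shm_id == -1)
    return NULL;

  /* Every process, creator or not, maps the segment into its address space. */
  void *addr = shmat(shm_id, NULL, 0);
  return addr == (void *)-1 ? NULL : (shared_state_t *)addr;
}
```

The `created` flag tells the first process that it is responsible for zeroing the struct before anyone else reads it.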
Kris Gaudel
*Stepping back, how will our ACBs connect to this shared memory?*
• Each process will call Semian.register(...), initializing an instance of an ACB
• The first process will try to create the shared memory as described above; the second will attach to the existing segment
Kris Gaudel
*Ok, but now how will UpdatePID get called?*
• Currently, each process has its own background thread that calls UpdatePID at the end of every window
• We can employ similar logic where each process has its own background thread that will try to acquire a mutex upon reaching the end of its window
• If it gets the mutex, it can call UpdatePID
Kris Gaudel
*Great, but what if we have a case where P1 updates the PID, then P2 (whose window just finished) tries to update the PID even though it was just updated?*
• Timestamps! We keep a timestamp of the last time the PID was updated
• Our UpdatePID function will update these values in our shared memory: rejection_rate and update_timestamp
• Processes looking to update the PID have to pass this condition: if time.now() - last_update is below a threshold, we skip the update since it was very recently updated
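A sketch of that timestamp guard, reusing the hypothetical `shared_state_t` from above; the lock helpers, the threshold, and `compute_rejection_rate` are placeholders rather than Semian's actual API:

```c
#include <time.h>

/* Placeholders for the real semaphore ops and PID math. */
extern int    try_acquire_update_lock(void);
extern void   release_update_lock(void);
extern double compute_rejection_rate(shared_state_t *state);

/* Assumed threshold: skip the update if another process refreshed the
   controller within roughly the last window. */
#define MIN_UPDATE_INTERVAL_MS 1000

static long
now_ms(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/* Called by each process's background thread at the end of its window. */
void
maybe_update_pid(shared_state_t *state)
{
  if (!try_acquire_update_lock())
    return;                                   /* another process is updating */

  long now = now_ms();
  if (now - state->update_timestamp_ms >= MIN_UPDATE_INTERVAL_MS) {
    state->rejection_rate      = compute_rejection_rate(state);  /* UpdatePID */
    state->update_timestamp_ms = now;
  }
  release_update_lock();
}
```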
Kris Gaudel
*How will shared memory get cleaned up*
• Reference counting
• When processes shut down they will reduce the reference count of the shared memory by 1 (ref_count -= 1)
• If the last process wants to shut down, it will check the reference count before shutting down
• The last process calls the destructor to clean up the shared memory
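A sketch of that shutdown path under the same assumptions, with the ref count kept inside the shared segment and adjusted under the same lock:

```c
#include <sys/ipc.h>
#include <sys/shm.h>

extern void acquire_update_lock(void);        /* blocking variant, placeholder */
extern void release_update_lock(void);

static void
release_shared_state(int shm_id, shared_state_t *state)
{
  /* Decrement the shared ref count under the lock so concurrent
     shutdowns don't race each other. */
  acquire_update_lock();
  int remaining = --state->ref_count;
  release_update_lock();

  shmdt(state);                               /* detach from this process */

  if (remaining == 0) {
    /* Last process out marks the segment for removal; the kernel frees
       it once nothing is attached any more. */
    shmctl(shm_id, IPC_RMID, NULL);
  }
}
```

This only covers orderly shutdowns; a process that crashes never decrements the count, which is the gap raised further down the thread.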
Kris Gaudel
New architecture and diagram to show inheritance
(two diagram images attached)
Kris Gaudel
Happy to pair on this tomorrow and clarify things
Abdulrahman Alhamali
wow, I really like how detailed this is. Overall it seems solid to me, and I really like the idea of all of the processes having their own PID controller. Because if we kept it to a single process, that process might die, and we'd end up without a controller
Anagayan Ariaran
Hey great work thinking through all this! The last update timestamp is definitely a key value to track.
each process has its own background thread
Just thinking about this: if each process has a thread that's constantly checking, we're more likely to run into contention. I also have concerns around opening threads to do this within applications that call Semian. We don't know the architecture of other applications, and opening threads that just check on this might not be a great pattern.

To combat this I was thinking this can be considered on a fraction of requests coming through, like say on 5% of requests after a time window. On that fraction they try to update the values by acquiring the semaphore. It's more on the hot path, but it's not every request, assuming that the actual calculations themselves won't take too long in general. (reference)
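A rough sketch of that sampled approach, building on the hypothetical `maybe_update_pid` above; the hook name, the 5% rate, and `window_has_elapsed` are all assumptions:

```c
#include <stdlib.h>

void maybe_update_pid(shared_state_t *state);   /* sketch from above */
int  window_has_elapsed(shared_state_t *state); /* placeholder */

/* Hypothetical hook run as a request finishes: only a small fraction of
   requests that land after the end of a window pay the cost of trying
   to grab the semaphore and refresh the shared controller state. */
static void
after_request_hook(shared_state_t *state)
{
  if (!window_has_elapsed(state))
    return;
  if ((double)rand() / RAND_MAX > 0.05)   /* sample roughly 5% of requests */
    return;
  maybe_update_pid(state);
}
```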

*How will shared memory get cleaned up*
Great work thinking this through on normal operations. I think the hard part to consider is what we do when processes crash or are killed and whether we can account for those unexpected events. There might be a tradeoff here where we can't account for everything but know that things will get cleaned up when pods restart, and the total volume of memory we use is marginal.
Abdulrahman Alhamali
To combat this I was thinking this can be considered on a fraction of requests coming through, like say on 5% of requests after a time window. On that fraction they try to update the values by acquiring the semaphore. It's more on the hot path, but it's not every request, assuming that the actual calculations themselves won't take too long in general. ([reference](https://draw.shopify.io/dtcmrhusnppj5ourmdddbpl2?d=v-1039.-1282.3598.3777.nzTtLG_aXBOO-JtR8phRE))
This might be useful for the current implementation that we have of the PID controller. Some people might not be fans of the fact that we are creating a thread. The problem, though, with doing it after requests is that requests with slow queries would end up blocking it, so the sliding window implementation would become less effective. However, this becomes less of an issue if we have many processes, so it's mostly an issue for the current implementation.
Abdulrahman Alhamali
Kris, let's make sure your research gets documented in GitHub
Kris Gaudel
Thanks for the feedback everyone!
Kris Gaudel
I haven’t used GitHubArchiver before, it’d be useful to refer to this stuff somewhere

Kris Gaudel archived this conversation from gsd-47642-semian-autoconfig at . All times are in UTC.