
SysV Shared memory module

Open · kris-gaudel opened this issue 1 month ago • 1 comment

Part of https://github.com/orgs/Shopify/projects/13332?pane=issue&itemId=139784715&issue=Shopify%7Cresiliency%7C6646

This implements the first two parts of the issue (SysV shared memory and atomic ops). PID state updates will be in a follow-up PR.

Instructions to build the extension and run the unit test (after initializing the container):

  1. `podman-compose -f .devcontainer/docker-compose.yml exec semian bash` to get into the container
  2. `bundle exec rake build` to compile
  3. `bundle exec ruby -Ilib:test test/shared_memory_test.rb` to run the unit test for the extension

kris-gaudel · Nov 19 '25 03:11

Summary

Here is a summary of the conversation:

  • Kris proposed a detailed architecture for implementing shared memory and PID update logic. The plan involves the first process creating a shared memory segment and semaphore, with subsequent processes attaching to them.
  • To manage PID updates, Kris suggested that each process run a background thread. This thread would attempt to acquire a mutex at the end of each time window to update the PID controller's values, such as the rejection_rate, in shared memory.
  • To prevent redundant updates from multiple processes, Kris's design includes a timestamp. A process will only perform an update if a certain amount of time has passed since the last update.
  • Anagayan expressed concern about each process creating a background thread, suggesting an alternative where updates are triggered on a small fraction of incoming requests instead. Abdulrahman noted this approach could be blocked by slow requests, making the sliding window less effective.
  • Anagayan also raised the issue of cleaning up shared memory when processes crash unexpectedly, suggesting there might be a trade-off where cleanup only happens on a full pod restart. Abdulrahman requested that Kris's research be documented in GitHub for future reference.

This summary was generated automatically using Google's gemini-2.5-pro (temperature 0.5).

:thread: Slack Thread
Kris Gaudel
I spent a lot of time thinking today about how the shared memory and PID update logic will be implemented (notes and drawings here). The description is in the :thread:
Kris Gaudel
*How the semaphore is initialized in the bulkhead*
1. Process 1 calls semget to try to create the semaphore; since the semaphore doesn't exist yet, the call doesn't fail with EEXIST and creates it (`sem_id = semget(...)`).
2. Process 2 calls semget, but it fails with EEXIST, so it sets its sem_id to that of the existing set (`sem_id = wait_for_new_semaphore_set(res->key, permissions);`).
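A minimal sketch of that create-or-attach handshake, with the helper name assumed and error handling trimmed (the real code also waits for the creator to finish initializing the set before using it):

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical helper: the first process creates the semaphore set,
   every later process falls back to attaching to the existing one. */
static int
create_or_attach_semaphore(key_t key, int permissions)
{
  /* Only one caller can win this race thanks to IPC_EXCL. */
  int sem_id = semget(key, 1, IPC_CREAT | IPC_EXCL | permissions);
  if (sem_id != -1)
    return sem_id;     /* we created it; the caller initializes it (SETVAL) */

  if (errno != EEXIST)
    return -1;         /* some other failure */

  /* The set already exists: look it up without IPC_EXCL and reuse it. */
  return semget(key, 1, permissions);
}
```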
Kris Gaudel
*We can follow the same logic but for our shared memory*
1. P1 tries to create the shared memory; it doesn't already exist, so `shm_id = shmget(...)` creates it.
2. P2 tries to create the shared memory; it already exists (EEXIST), so it looks up the existing segment (`shm_id = shmget(...)` without IPC_EXCL) and attaches to it with `shmat(...)`.
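Under the same assumptions, a sketch of the shared memory side; the `shared_state_t` layout here is a placeholder for whatever the PID controller ends up storing:

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical layout for the segment; the real struct will hold the
   PID controller state discussed below. */
typedef struct {
  double rejection_rate;
  long   update_timestamp_ms;
  int    ref_count;
} shared_state_t;

static shared_state_t *
create_or_attach_shared_state(key_t key, int permissions, int *created)
{
  /* P1 creates the segment exclusively... */
  int shm_id = shmget(key, sizeof(shared_state_t),
                      IPC_CREAT | IPC_EXCL | permissions);
  if (shm_id != -1) {
    *created = 1;
  } else if (errno == EEXIST) {
    /* ...P2..Pn find it already exists and just look it up. */
    *created = 0;
    shm_id = shmget(key, sizeof(shared_state_t), permissions);
  }
  if (shm_id == -1)
    return NULL;

  /* Every process, creator or not, maps the segment into its address space. */
  void *addr = shmat(shm_id, NULL, 0);
  return addr == (void *)-1 ? NULL : (shared_state_t *)addr;
}
```

The `created` flag tells the first process that it is responsible for zeroing the struct before anyone else reads it.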
Kris Gaudel
*Stepping back, how will our ACBs connect to this shared memory?*
• Each process will call Semian.register(...), initializing an instance of an ACB
• The first process will try to create the shared memory as described above; the second will attach to the existing segment
Kris Gaudel
*Ok, but now how will UpdatePID get called?*
• Currently, each process has its own background thread that calls UpdatePID at the end of every window
• We can employ similar logic where each process has its own background thread that will try to acquire a mutex upon reaching the end of its window
• If it gets the mutex, it can call UpdatePID
Kris Gaudel
*Great, but what if we have a case where P1 updates the PID, then P2 (whose window just finished) tries to update the PID even though it was just updated?*
• Timestamps! We keep a timestamp of the last time the PID was updated
• Our UpdatePID function will update these values in our shared memory: rejection_rate and update_timestamp
• Processes looking to update the PID have to pass this condition: if time.now() - last_update is below a threshold, we skip the update since it was very recently updated
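A sketch of that timestamp guard, reusing the hypothetical `shared_state_t` from above; the lock helpers, the threshold, and `compute_rejection_rate` are placeholders rather than Semian's actual API:

```c
#include <time.h>

/* Placeholders for the real semaphore ops and PID math. */
extern int    try_acquire_update_lock(void);
extern void   release_update_lock(void);
extern double compute_rejection_rate(shared_state_t *state);

/* Assumed threshold: skip the update if another process refreshed the
   controller within roughly the last window. */
#define MIN_UPDATE_INTERVAL_MS 1000

static long
now_ms(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/* Called by each process's background thread at the end of its window. */
void
maybe_update_pid(shared_state_t *state)
{
  if (!try_acquire_update_lock())
    return;                                   /* another process is updating */

  long now = now_ms();
  if (now - state->update_timestamp_ms >= MIN_UPDATE_INTERVAL_MS) {
    state->rejection_rate      = compute_rejection_rate(state);  /* UpdatePID */
    state->update_timestamp_ms = now;
  }
  release_update_lock();
}
```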
Kris Gaudel
*How will shared memory get cleaned up*
• Reference counting
• When processes shut down they will reduce the reference count of the shared memory by 1 (ref_count -= 1)
• If the last process wants to shut down, it will check the reference count before shutting down
• The last process calls the destructor to clean up the shared memory
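A sketch of that shutdown path under the same assumptions, with the ref count kept inside the shared segment and adjusted under the same lock:

```c
#include <sys/ipc.h>
#include <sys/shm.h>

extern void acquire_update_lock(void);        /* blocking variant, placeholder */
extern void release_update_lock(void);

static void
release_shared_state(int shm_id, shared_state_t *state)
{
  /* Decrement the shared ref count under the lock so concurrent
     shutdowns don't race each other. */
  acquire_update_lock();
  int remaining = --state->ref_count;
  release_update_lock();

  shmdt(state);                               /* detach from this process */

  if (remaining == 0) {
    /* Last process out marks the segment for removal; the kernel frees
       it once nothing is attached any more. */
    shmctl(shm_id, IPC_RMID, NULL);
  }
}
```

This only covers orderly shutdowns; a process that crashes never decrements the count, which is the gap raised further down the thread.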
Kris Gaudel
New architecture and diagram to show inheritance
(two diagram images attached)
Kris Gaudel
Happy to pair on this tomorrow and clarify things
Abdulrahman Alhamali
wow, I really like how detailed this is. Overall it seems solid to me, and I really like the idea of all of the processes having their own PID controller. Because if we kept it to a single process, that process might die, and we'd end up without a controller
Anagayan Ariaran
Hey great work thinking through all this! The last update timestamp is definitely a key value to track.
each process has its own background thread
Just thinking about this: if each process has a thread that's constantly checking, we're more likely to run into contention. I also have concerns around opening threads to do this within applications that call Semian. We don't know the architecture of other applications, and opening threads that just check on this might not be a great pattern.

To combat this I was thinking this can be considered on a fraction of requests coming through, like say on 5% of requests after a time window. On that fraction they try to update the values by acquiring the semaphore. It's more on the hot path, but it's not every request, assuming that the actual calculations themselves won't take too long in general. (reference)
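A rough sketch of that sampled approach, building on the hypothetical `maybe_update_pid` above; the hook name, the 5% rate, and `window_has_elapsed` are all assumptions:

```c
#include <stdlib.h>

void maybe_update_pid(shared_state_t *state);   /* sketch from above */
int  window_has_elapsed(shared_state_t *state); /* placeholder */

/* Hypothetical hook run as a request finishes: only a small fraction of
   requests that land after the end of a window pay the cost of trying
   to grab the semaphore and refresh the shared controller state. */
static void
after_request_hook(shared_state_t *state)
{
  if (!window_has_elapsed(state))
    return;
  if ((double)rand() / RAND_MAX > 0.05)   /* sample roughly 5% of requests */
    return;
  maybe_update_pid(state);
}
```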

*How will shared memory get cleaned up*
Great work thinking this through on normal operations. I think the hard part to consider is what we do when processes crash or are killed and whether we can account for those unexpected events. There might be a tradeoff here where we can't account for everything but know that things will get cleaned up when pods restart, and the total volume of memory we use is marginal.
Abdulrahman Alhamali
To combat this I was thinking this can be considered on a fraction of requests coming through, like say on 5% of requests after a time window. On that fraction they try to update the values by acquiring the semaphore. It's more on the hot path, but it's not every request, assuming that the actual calculations themselves won't take too long in general. ([reference](https://draw.shopify.io/dtcmrhusnppj5ourmdddbpl2?d=v-1039.-1282.3598.3777.nzTtLG_aXBOO-JtR8phRE))
This might be useful for the current implementation that we have of the PID controller. Some people might not be fans of the fact that we are creating a thread. The problem, though, with doing it after requests is that requests with slow queries would end up blocking it, so the sliding window implementation would become less effective. However, this becomes less of an issue if we have many processes, so it's mostly an issue for the current implementation.
Abdulrahman Alhamali
Kris, let's make sure your research gets documented in GitHub
Kris Gaudel
Thanks for the feedback everyone!
Kris Gaudel
I haven’t used GitHubArchiver before, it’d be useful to refer to this stuff somewhere

Kris Gaudel archived this conversation from gsd-47642-semian-autoconfig at . All times are in UTC.