Proposal For An Actor System Based On Mojo
This is currently a work in progress. There are no code changes, just a proposal written in the proposals section. This was pre-approved by Chris Lattner in a conversation in June 2023.
I will keep working on this as I have time, but it is far enough along that I'm looking for feedback and assistance from interested parties.
I will take it out of draft mode when it's a little further along.
Signed-off-by: Reid Spencer [email protected]
This is cool Reid, thank you for putting this together. We're quite a bit too early to invest in this area IMO (we need to get traits much further along and complete lifetimes), but I think this is a very likely long-term direction. If you're interested, Actors got built into Swift with a more complex model than was in the manifesto; you can read about it here, or in several swift-evolution proposals: https://docs.swift.org/swift-book/documentation/the-swift-programming-language/concurrency#Actors
I do hope we can eschew the complexity; a lot of it is due to legacy interop with Apple frameworks. OTOH, we may need such things to work with legacy Python and other libs, though.
@lattner - I understand the language earliness, but I think there's value in starting the Actors project early. So, I have started already: https://github.com/ossuminc/moxy (extremely nascent). In the proposal, I've tried to minimize the requirements on Mojo. The recent introduction of Traits allowed me to get started; all that is still needed is default implementations in traits. Later on, when Mojo has matured, it would be interesting to integrate an ASIC or GPU to help with extremely fast message dispatch. All in good time. I'm happy to start this work without the involvement of Modular's time/resources, at least for now.
Thanks for the reference to the Actor implementation in Swift. I am examining several actor systems to try to glean the winning strategies from their patterns. Akka is where my existing knowledge is strongest, but I'm open to merging the best ideas from other ecosystems.
I plan to leave interoperability to the end of actor development and not sacrifice simplicity or performance. In other words, interoperability will have its own complexity and costs, as an add-on.
Going to close this PR for now since there hasn't been much activity on this in several months. Feel free to reopen when we're ready to take on this kind of work.
@Brian-M-J I've looked into the senders and receivers proposal. I see people promising big things ("a global solution to concurrency"), but notably, they don't seem to be able to back up their claims with evidence. I would expect to see an example of a non-trivial concurrent program that is dramatically easier to write with senders as opposed to being written with async/await and tasks etc. All I see are toy examples.
I am really skeptical that something big has been discovered. If it had, more people would have noticed by now. Senders and receivers have been around for 4+ years.
I guess talks like this would be good demonstrations at least.
That's another toy example. All he's done is create an event loop that spawns asynchronous tasks one-at-a-time. This is trivial to do in any language with async/await and a Task type.
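To illustrate what "trivial" means here, a minimal sketch in Python asyncio (Python rather than Mojo, purely for illustration) of an event loop that runs asynchronous tasks one at a time:

```python
import asyncio

async def work(i: int) -> None:
    # Stand-in for an asynchronous unit of work.
    print(f"task {i} running")
    await asyncio.sleep(0.01)

async def main() -> None:
    # "An event loop that spawns asynchronous tasks one at a time":
    # spawn a task, await it, then move on to the next.
    for i in range(5):
        await asyncio.create_task(work(i))

asyncio.run(main())
```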
I would love it if S&R had solved some major problems with modelling concurrent systems, but I don't see it.
At the risk of stating the obvious, there are a lot of projects out there developed by well-meaning, passionate people, who promise that they have created something important. But most of the time, that doesn't turn out to be the case. I've been burned a lot in the past by believing that a project is as important as the contributors say it is, and then I've begun to experiment with it, only to eventually discover that I've wasted my time.
Getting rid of function coloring is a worthy goal, no doubt about that. But that is orthogonal to S&R. The reason most PLs have colored functions is because the thread-based concurrency model that they had already implemented prior to implementing async/await is incompatible with implicit suspension and implicit migration of tasks between threads. In the case of Python, the main reason you can't implicitly switch tasks is because a Python program is a big soup of shared mutable state with no synchronization/critical sections, so two tasks can easily race each other.
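As a concrete illustration of that kind of race, here is a small Python asyncio sketch (an assumption-laden stand-in, not anything from the proposal) in which a read-modify-write of shared state straddles a suspension point, so two tasks lose each other's updates. With implicit suspension, any point in the program could behave like the `await` below:

```python
import asyncio

counter = 0

async def increment(n: int) -> None:
    global counter
    for _ in range(n):
        value = counter          # read shared state
        await asyncio.sleep(0)   # suspension point: the other task may run here
        counter = value + 1      # write back a possibly stale value

async def main() -> None:
    await asyncio.gather(increment(1000), increment(1000))
    # Frequently prints far less than 2000 because updates are lost.
    print(counter)

asyncio.run(main())
```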
The solution to this is to come up with a concurrency model that ensures tasks can't race each other. You can get most of the way there with a Rust-like borrowing system. On top of that, you'd want a way to perform transactions on shared state. This is a big design space worth exploring. S&R doesn't really have a solution here. (I'd like to see an example of multiple tasks concurrently printing to stdout using S&R.)
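For contrast, here is what that stdout example looks like today with async/await and explicit locking (a Python asyncio sketch, used only as a stand-in for the status quo being argued against, not an S&R example):

```python
import asyncio

stdout_lock = asyncio.Lock()

async def report(name: str, lines: list[str]) -> None:
    # Without the explicit lock, output from different tasks could
    # interleave in the middle of a report.
    async with stdout_lock:
        for line in lines:
            print(f"{name}: {line}")
            await asyncio.sleep(0)  # simulate work between lines

async def main() -> None:
    await asyncio.gather(
        report("task-a", ["one", "two", "three"]),
        report("task-b", ["uno", "dos", "tres"]),
    )

asyncio.run(main())
```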
I've seen that. The idea is that if you can statically identify all of the places in your codebase where a variable is being accessed by multiple tasks—and if at least one of those tasks mutates the variable—then you (or your compiler) can conceivably restructure the program (cut it up into subtasks) such that any time a task needs to access the shared variable, you defer the access to a scheduler/executor. (And the executor contains the synchronization primitives required to avoid data races.)
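One way to picture "defer the access to a scheduler/executor" is to funnel every mutation of the shared variable through a single worker task, so the synchronization lives in one place. A rough Python asyncio sketch of that shape (my own illustration, not any particular library's API):

```python
import asyncio
from typing import Callable

class SerializedCounter:
    """Shared state whose mutations are applied by a single worker task."""

    def __init__(self) -> None:
        self.value = 0
        self._updates: asyncio.Queue = asyncio.Queue()
        self._worker = asyncio.create_task(self._run())

    async def _run(self) -> None:
        # The lone worker applies updates one at a time, so no data race is
        # possible even though many tasks submit updates concurrently.
        while True:
            update: Callable[[int], int] = await self._updates.get()
            self.value = update(self.value)
            self._updates.task_done()

    async def submit(self, update: Callable[[int], int]) -> None:
        await self._updates.put(update)

    async def close(self) -> None:
        await self._updates.join()
        self._worker.cancel()

async def main() -> None:
    counter = SerializedCounter()

    async def bump(n: int) -> None:
        for _ in range(n):
            await counter.submit(lambda v: v + 1)

    await asyncio.gather(bump(1000), bump(1000))
    await counter.close()
    print(counter.value)  # 2000: every update was serialized by the worker

asyncio.run(main())
```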
This is a good observation, and I strongly agree that a task-based concurrency model should aim to do this. (Concurrency without explicit locking would be amazing!) But this is—again—completely orthogonal to S&R. S&R doesn't give me a simple way to write that restructured program, from what I can tell.
Actually, I'd only read the second article you linked, but the first article is more interesting IMO, because it actually discusses the "program restructuring" problem I'm referring to:
Let’s assume that one has an application that can be broken down in tasks relatively easily. But, at some point, deep down in the execution of a task one would need a lock. Ideally one would break the task in 3 parts: everything that goes before the lock, the protected zone, and everything that goes after. Still, easier said than done; this can be hard if one is 20 layers of function-calls – it’s not easy to break all these 20 layers of functions into 3 tasks and keep the data dependencies correct. If breaking the task into multiple parts is not easily doable, then one can also use the fork-join model to easily get out of the mess.
This proposed solution—forking a new task to mutate the shared variable—makes a lot of sense. It's worth exploring further.
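A hedged sketch of that restructuring in Python asyncio: rather than breaking a deep call chain into three tasks, the caller forks the protected zone off as its own small task and joins it, keeping the locking confined to that task (the explicit lock here stands in for whatever serialized executor a real task system would provide):

```python
import asyncio

shared_total = 0
lock = asyncio.Lock()

async def protected_update(amount: int) -> None:
    # The "protected zone", isolated into its own small task.
    global shared_total
    async with lock:
        shared_total += amount

async def process(amount: int) -> None:
    # ... everything that goes before the lock ...
    update = asyncio.create_task(protected_update(amount))  # fork
    # ... other work that does not touch the shared state ...
    await update                                            # join
    # ... everything that goes after the lock ...

async def main() -> None:
    await asyncio.gather(*(process(i) for i in range(10)))
    print(shared_total)  # 45

asyncio.run(main())
```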
@nmsmith maybe take a look at how NVIDIA leverages P2300 with CUDA to do async computation on GPUs? https://www.youtube.com/watch?v=nwrgLH5yAlM