SetReplace
Parallelize deterministic WolframModel on GPU
The problem
This is similar to #155, but for GPUs rather than CPUs. We need both because some users might not have access to a GPU, especially if we don't support all of them.
This issue specifically is for a parallelization supporting a specific "EventOrderingFunction", i.e., it should not change the output of WolframModel in any way. There is another issue (TODO: add issue) for parallelizing undefined-order evolution.
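To illustrate the constraint, here is a minimal C++ sketch of one way a parallel search could keep the output unchanged: each worker collects matches from its own chunk of edges, and the results are combined with a min-reduction under a fixed total order. Everything here is hypothetical and is not libSetReplace's API: the `Edge` and `Match` types, the toy pattern `{{x, y}, {y, z}}`, and the lexicographic order standing in for a real "EventOrderingFunction". The workers are emulated sequentially by visiting chunks in a caller-chosen order; real threads or GPU blocks would replace that loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical types for illustration only (not libSetReplace's API).
using Edge = std::pair<int, int>;                   // binary edge {from, to}
using Match = std::pair<std::size_t, std::size_t>;  // indices of the two matched edges

// Finds matches of the toy pattern {{x, y}, {y, z}} whose first edge index is in [begin, end).
std::vector<Match> findMatchesInChunk(const std::vector<Edge>& edges,
                                      std::size_t begin, std::size_t end) {
  std::vector<Match> found;
  for (std::size_t i = begin; i < end; ++i) {
    for (std::size_t j = 0; j < edges.size(); ++j) {
      if (i != j && edges[i].second == edges[j].first) found.emplace_back(i, j);
    }
  }
  return found;
}

// Splits the edges into chunkOrder.size() chunks and visits them in the given order,
// emulating workers that finish at different times. The winner is the smallest match
// under the lexicographic order on (i, j), an assumed stand-in for the ordering function.
// Returns false if there is no match at all.
bool selectNextMatch(const std::vector<Edge>& edges,
                     const std::vector<std::size_t>& chunkOrder, Match& result) {
  const std::size_t chunkCount = chunkOrder.size();
  assert(chunkCount > 0);
  const std::size_t chunkSize = (edges.size() + chunkCount - 1) / chunkCount;
  bool found = false;
  for (std::size_t chunk : chunkOrder) {
    const std::size_t begin = std::min(chunk * chunkSize, edges.size());
    const std::size_t end = std::min(begin + chunkSize, edges.size());
    for (const Match& m : findMatchesInChunk(edges, begin, end)) {
      // Min-reduction: the result depends only on the set of matches, not on visit order.
      if (!found || m < result) {
        result = m;
        found = true;
      }
    }
  }
  return found;
}
```

Because the order is total, the selected match is the same no matter how the work is split or which chunk is processed first, which is the property a deterministic GPU implementation would need to preserve.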
Additional context
Feel free to create subtasks for different GPU frameworks.
If this is C++, it may be possible to turn this into CUDA code. Could you point me towards the right place in the codebase?
I can see a few "EventOrderingFunction" occurrences in the .wlt and .m files, but not in the C++ code. And then there is libSetReplace, which seems like a candidate for porting to CUDA. But it is not clear what exactly needs parallelization / GPU.
@dchichkov, yes, libSetReplace is the "LowLevel" C++ implementation of the Wolfram model evolution. "EventOrderingFunction" corresponds to the OrderingSpec in the C++ code. Although I think it will probably be easier to start with #354, which does not put any constraints on the event order.
The question is whether we can run the entire Wolfram model evolution on a GPU. This starts with Set::replace, and the heaviest code is in the Matcher class. The basic algorithm is briefly discussed here.
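As a rough illustration of the kind of work that might map onto a GPU, here is a toy C++ sketch of a per-edge pattern test followed by a compaction step. This is not the actual `Matcher` algorithm: the names `Hyperedge`, `matchesPattern`, `flagMatches`, and `compactMatches` are made up for this example, it handles only single-edge patterns, and it assumes that distinct pattern variables may bind to the same atom. The point is only that each edge is tested independently, so in a CUDA port each iteration of the flagging loop could become one GPU thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type for illustration only (not libSetReplace's API).
using Hyperedge = std::vector<int>;

// True if `edge` is an instance of the single-edge `pattern`: same arity, and any two
// positions that share a pattern variable hold the same atom in the edge.
bool matchesPattern(const Hyperedge& pattern, const Hyperedge& edge) {
  if (pattern.size() != edge.size()) return false;
  for (std::size_t i = 0; i < pattern.size(); ++i) {
    for (std::size_t j = i + 1; j < pattern.size(); ++j) {
      if (pattern[i] == pattern[j] && edge[i] != edge[j]) return false;
    }
  }
  return true;
}

// Map step: one independent test per edge. No iteration reads another iteration's
// output, so the loop body is what a per-edge GPU thread would execute.
std::vector<char> flagMatches(const Hyperedge& pattern, const std::vector<Hyperedge>& edges) {
  std::vector<char> flags(edges.size(), 0);
  for (std::size_t k = 0; k < edges.size(); ++k) {
    flags[k] = matchesPattern(pattern, edges[k]) ? 1 : 0;
  }
  return flags;
}

// Compaction step: collect the indices of flagged edges in ascending order.
// On a GPU this is typically done with a prefix sum rather than a serial loop.
std::vector<std::size_t> compactMatches(const std::vector<char>& flags) {
  std::vector<std::size_t> indices;
  for (std::size_t k = 0; k < flags.size(); ++k) {
    if (flags[k]) indices.push_back(k);
  }
  return indices;
}
```

Multi-edge patterns are harder because partial matches have to be joined on shared variables, which is presumably where most of the cost in the real matcher lies.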
There is a lot to explain though (and I don't know much about GPU programming myself, so I'm not quite sure what's possible). If you'd like to help, it would be great if you joined our Discord server. We are also hosting a Q&A at 10 AM Pacific Time tomorrow (10/10). If you can join, we can discuss it there (or any other time after that).
And thanks for your interest!
@maxitg Nice. Thank you for kicking me towards this code. I'll look.
Just in case you are interested in skimming an introduction to GPU programming: https://developer.nvidia.com/blog/even-easier-introduction-cuda/
There's also a higher-level GPU library that supports some graph operations, is open source, and has C++ and Python APIs: https://github.com/rapidsai/cugraph
Maybe it could be used as a framework? The "Subgraph Extraction" might be a good starting point? Its code is here - https://github.com/rapidsai/cugraph/blob/branch-0.16/cpp/src/community/extract_subgraph_by_vertex.cu
There is also layout/visualization there - "Force Atlas 2".