nest-simulator
nest-simulator copied to clipboard
Improve speed of dump_connections() in layer_impl.h
Instead of calling get_connections() inside a loop for every source node, it calls get_connections() once for all connections between layers.
The code considers that:
- source nodes positions (src_vec) is ordered by source node.
- connectome is ordered by source node.
fixes #3142
This is a proposed solution for issue #3142
This implementation only works for non-MPI and in some cases (not all) of MPI. Hence, DO NOT USE !!!!
@xavierotazuGDS Thank you for your PR! Since there still seems to be a problem with the code as of your last message, I have converted it to "Draft" status. You can change that back again later.
I have uploaded a new version. No bugs now on MPI. I do not change the Draft status because I am not sure how to do it.
Status changed from "Draft" to "Open".
Thanks for the update, xavierotazuGDS!
There are/were still some formatting issues (mostly regarding white-spaces as far as I see), but they need to be fixed to keep code in a consistently formatted state (see Contribute to NEST).
Since you seem to be a first time contributor, please also send a Contributor Agreement to [email protected] so we can accept and correctly acknowledge your contribution.
I just saw, that you already did send the CA! My appologies.
terhorstd,
I believe I solved all my code style issues. The other issues are in methods I have not modified (they are clang-format issues from the original code). Anyway, I could try to solve them (I will need some time).
I took a look at clang-issues in methods I have not modified, but suggested changes are so big code blocks that I would not know what I am doing. Hence, I prefer not to modify clang-format issues outside the method I modifed.
The version of the code formatter changed some time back, this is why there are complaints. Everything is already fixed in the main branch, which your branch is several commits behind (see on the main page of your repo "… behind"). Try to merge those into your branch, that should fix all spots that you didn't touch.
I ran some basic benchmarks on the JURECA cluster with the following setup:
- 1 node, 64 threads
- Scaling number of layers (100x100 grid each) connected to every other layer (and itself) with a 25x25 mask
- Average over 10 runs, call
DumpLayerConnections
after each layer connection
Layers | Num Total Conns | Time | Final Memory | Branch |
---|---|---|---|---|
1 | 6.250.000 | 19.76s | 2873MB | Improved |
2 | 25.000.000 | 81.15s | 3891MB | Improved |
4 | 100.000.000 | 322.11s | 7743MB | Improved |
1 | 6.250.000 | 28.22s | 1600MB | Master |
2 | 25.000.000 | 139.72s | 2735MB | Master |
4 | 100.000.000 | 1070.84s | 6330MB | Master |
This seems to indicate that it scales pretty linearly, not even increasing memory usage by much
Pull request automatically marked stale!
@otcathatsya I have worked a bit more on the improved dump_connection(). You can find my current status here: https://github.com/heplesser/nest-simulator/tree/xdump. Could you run a subset of your benchmarks again on that branch to ensure I did not introduce performance regressions?
@otcathatsya I have worked a bit more on the improved dump_connection(). You can find my current status here: https://github.com/heplesser/nest-simulator/tree/xdump. Could you run a subset of your benchmarks again on that branch to ensure I did not introduce performance regressions?
My branch is now merged here, so you can work on @xavierotazuGDS's branch for this PR.
Sorry for the hold-up, here are the benchmarks for the latest commit:
@otcathatsya Thank you for the data—and good they look, so here we merge ...