CoreNeuron
Added hybrid MPI+OpenMP test in CI
- Added nrntraub test and run it with 4 ranks and 9 threads on BB5 with the SoA CoreNEURON build
- Uses https://github.com/iomaganaris/nrntraub/tree/icei which creates the coredat by default in the NEURON run
- Closes https://github.com/BlueBrain/CoreNeuron/issues/292
@pramodk I still didn't manage to run NEURON with the threading option with nrntraub. If you think that this should be tested, maybe we can have a look together at some point.
Also, let me know if I should create a PR for my fork of nrntraub.
I think we should do that. Can you post the instructions with the error message here and tag Michael?
Hello @nrnhines
We were trying to run the nrntraub test from https://github.com/pramodk/nrntraub/tree/icei with threading enabled in NEURON to launch CoreNEURON from NEURON and test OpenMP.
After cloning the repo I did the following:
nrnivmodl mod
srun -n 1 ./x86_64/special -c nthread=9 -mpi -c mytstop=100 -c use_coreneuron=0 init.hoc
Note that I am using 1 rank because pc.nthread gets set only if pc.nhost == 1, and I am setting use_coreneuron=0 for debugging in this case. With use_coreneuron=1 there is the same issue.
And I get the following error:
...
SetupTime: 4.8000002
mytstop 100
/gpfs/bbp.cscs.ch/project/proj16/magkanar/spack/software/install/linux-rhel7-x86_64/intel-19.0.4.243/neuron-develop-3csnze/x86_64/bin/nrniv: usable mindelay is 0 (or less than dt for fixed step method)
in init.hoc near line 65
prun()
^
finitialize(-70)
init()
stdinit()
prun()
I figured out that the issue comes from calling stdinit() from prun() in hoc/parlib.hoc.
I am using NEURON master and the Intel compiler.
Could you help us with this issue?
Thank you very much in advance!
If you are using threads you cannot have any NetCon.delay = 0 (or less than dt). Of the 109982 NetCon, 265 of them have a delay of 0. Just to see if that is the problem, try again with
diff --git a/hoc/parlib2.hoc b/hoc/parlib2.hoc
index d9eb164..1fbdee3 100755
--- a/hoc/parlib2.hoc
+++ b/hoc/parlib2.hoc
@@ -50,7 +50,7 @@ proc par_netstim_create() {local gid localobj cell, syn, nc, ns, r
netstims.append(ns)
nc = new NetCon(ns.pp, syn)
netstim_netcons.append(nc)
- nc.delay = 0
+ nc.delay = 1
r = new Random()
r.negexp(1)
// r.Isaac64(netstim_random_seedoffset + netstim_base_)
For MPI and nthread=1 it is generally OK to have NetCon.delay=0, but only if they are not interprocessor NetCon (i.e. source and target must be on the same process).
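The two delay rules stated above can be sketched as a small check. This is only an illustration: the function name find_short_delays and its arguments are hypothetical, the delays are passed in as plain numbers rather than read from a model, and the dt value in the example is an assumption.

```python
from typing import List, Optional, Sequence


def find_short_delays(
    delays: Sequence[float],
    dt: float,
    nthread: int,
    interprocessor: Optional[Sequence[bool]] = None,
) -> List[int]:
    """Return indices of connections whose delay breaks the rules above.

    Hypothetical helper, not part of NEURON. In a real model the delays
    could presumably be gathered from the NetCon objects (for example by
    iterating over h.List("NetCon") and reading nc.delay); that step is an
    assumption and is left out so the sketch runs without NEURON.
    """
    if interprocessor is None:
        # Assume every connection is local unless told otherwise.
        interprocessor = [False] * len(delays)

    bad = []
    for i, (delay, remote) in enumerate(zip(delays, interprocessor)):
        too_short = delay < dt  # covers delay == 0 as well
        # With threads, no short delay is allowed at all. Without threads,
        # a short delay is tolerated only for same-process connections.
        if too_short and (nthread > 1 or remote):
            bad.append(i)
    return bad


if __name__ == "__main__":
    delays = [0.0, 1.0, 0.01, 0.5]
    # Threaded run: both short delays are reported.
    print(find_short_delays(delays, dt=0.025, nthread=9))
    # Single thread: only the interprocessor short delay is reported.
    print(find_short_delays(delays, dt=0.025, nthread=1,
                            interprocessor=[False, False, True, False]))
```

Running such a check before prun() would point at the offending connections directly instead of relying on the "usable mindelay is 0" error.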
By the way, I noticed another problem when launching python from within the nrntraub repository.
hines@hines-T7500:~/models/nrntraub-icei$ python
Python 3.7.6 (default, Feb 17 2020, 15:09:28)
[GCC 7.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import neuron
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/hines/neuron/nrncmake/build/install/lib/python/neuron/__init__.py", line 132, in <module>
import nrn
ModuleNotFoundError: No module named 'nrn'
>>>
This seems to be an artifact of having a 'hoc' folder in the repository.
I got some time to work again on this test. Thank you very much for your suggestion @nrnhines to set nc.delay = 1. NEURON and CoreNEURON with threading worked with this.
However, I get the following issues with threading enabled.
First, NEURON generates different spikes when the simulation runs with more than one thread and more than one MPI rank than when running the simulation with 1 MPI rank and multiple threads, or with multiple MPI ranks and no threading.
For example:
bash-4.2$ srun -n 1 ./x86_64/special -mpi -c use_coreneuron=0 -c nthread=36 -c mytstop=100 init.hoc
bash-4.2$ srun -n 4 ./x86_64/special -mpi -c use_coreneuron=0 -c nthread=9 -c mytstop=100 init.hoc
bash-4.2$ sort -n -k'1,1' -k2 < out1.dat | awk 'NR==1 { print; next } { printf "%.3f\t%d\n", $1, $2 }' > out1.sorted
bash-4.2$ sort -n -k'1,1' -k2 < out4.dat | awk 'NR==1 { print; next } { printf "%.3f\t%d\n", $1, $2 }' > out4.sorted
bash-4.2$ sdiff -s out1.sorted out4.sorted
10.375 186 <
> 10.400 186
> 11.125 199
11.150 199 <
> 12.950 220
12.975 220 <
> 13.000 188
13.025 188 <
13.025 264 | 13.050 264
> 13.525 102
13.550 102 <
13.675 288 <
> 13.700 288
> 13.925 323
13.950 323 <
14.275 312 <
> 14.300 312
> 14.300 318
14.325 318 | 14.350 87
14.350 192 <
14.375 87 <
...
During the first timesteps the spikes are the same, but then the timesteps at which the spikes are generated start to differ. In most cases the generated spikes differ by 1 timestep. Running NEURON with 36 MPI ranks and 1 thread generates the same spikes as 1 MPI rank and 36 threads.
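To put a number on "differ by 1 timestep", the shift of each spike can be expressed in units of dt and tallied. The sketch below is an assumption-laden illustration: the helper names are hypothetical, dt = 0.025 is inferred from the spacing of the spike times in the output above rather than stated anywhere, and the k-th spike of a cell in one run is simply paired with the k-th spike of the same cell in the other run.

```python
from collections import Counter, defaultdict


def spikes_by_gid(spikes):
    """Group (time, gid) pairs into {gid: sorted list of spike times}."""
    grouped = defaultdict(list)
    for t, gid in spikes:
        grouped[gid].append(t)
    return {gid: sorted(times) for gid, times in grouped.items()}


def shift_histogram(spikes_a, spikes_b, dt=0.025):
    """Count spike-time shifts (run B minus run A) in whole timesteps.

    Spikes are paired per gid in time order; if a gid fires a different
    number of times in the two runs, the unmatched tail is ignored.
    """
    a = spikes_by_gid(spikes_a)
    b = spikes_by_gid(spikes_b)
    hist = Counter()
    for gid in a.keys() & b.keys():
        for ta, tb in zip(a[gid], b[gid]):
            hist[round((tb - ta) / dt)] += 1
    return hist


if __name__ == "__main__":
    # A few spikes taken from the sdiff output above, plus one
    # made-up spike (5.0, 10) that is identical in both runs.
    run_1rank = [(5.0, 10), (10.375, 186), (11.150, 199), (12.975, 220)]
    run_4rank = [(5.0, 10), (10.400, 186), (11.125, 199), (12.950, 220)]
    for shift, count in sorted(shift_histogram(run_1rank, run_4rank).items()):
        print(f"{shift:+d} timesteps: {count} spikes")
```

A histogram concentrated at -1, 0 and +1 would match the observation that most spikes move by a single timestep; a wider spread would suggest the runs have diverged more substantially.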
The other issue is with the spikes generated by CoreNEURON. In all of the above cases CoreNEURON generates the same spikes as NEURON in the beginning, but then after a timestep spikes start to shift in time. For example:
bash-4.2$ srun -n 4 ./x86_64/special -mpi -c use_coreneuron=1 -c nthread=9 -c mytstop=100 init.hoc
bash-4.2$ sort -n -k'1,1' -k2 < out.dat | awk 'NR==1 { print; next } { printf "%.3f\t%d\n", $1, $2 }' > out4.cn.sorted
bash-4.2$ sdiff -s out4.sorted out4.cn.sorted
bash-4.2$ sdiff -s out4.sorted out4.cn.sorted | more
> 5.900 160
> 6.050 176
> 6.050 180
6.750 160 <
6.750 176 <
6.825 180 <
6.925 188 <
> 6.950 188
6.975 168 <
> 7.000 168
> 7.375 287
7.400 287 <
> 7.550 290
7.575 290 <
...
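Since the two runs agree at first and then drift apart, it can help to locate the earliest spike that appears in only one of them. The following is a minimal sketch with a hypothetical helper name; it rounds times to three decimals as the awk command above does, and the early common spike in the example is made up for illustration.

```python
def first_divergence(spikes_a, spikes_b, ndigits=3):
    """Return the earliest (time, gid) present in only one run, or None.

    Times are rounded to `ndigits` decimals before comparison so that
    formatting noise does not count as a difference.
    """
    a = {(round(t, ndigits), gid) for t, gid in spikes_a}
    b = {(round(t, ndigits), gid) for t, gid in spikes_b}
    only_in_one = a ^ b  # symmetric difference of the two spike sets
    return min(only_in_one) if only_in_one else None


if __name__ == "__main__":
    # (1.0, 5) is a made-up common spike; the rest comes from the
    # NEURON vs CoreNEURON sdiff output above.
    neuron = [(1.0, 5), (6.750, 160), (6.750, 176), (6.825, 180)]
    coreneuron = [(1.0, 5), (5.900, 160), (6.050, 176), (6.050, 180)]
    print(first_divergence(neuron, coreneuron))
```

Knowing the first divergent time and gid narrows down where to compare state between the two simulators.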
I am using my fork of nrntraub and the branch icei from here, which includes the change in the delay and allows the selection of the number of threads when more than 1 MPI rank is used.
Are the issues mentioned above related to the thread implementation, or is there something going on with the test?
Any help would be greatly appreciated.
Thank you very much, Ioannis
@nrnhines: Similar to the olfactory bulb model, do you think the issue described above might be with the model itself? In that case I will go ahead and use whatever baseline the model provides with X MPI ranks and Y threads per MPI rank.
Discrepancies between NEURON and CoreNEURON in this situation are presumptively bugs. I assume there are no intra-NEURON or intra-CoreNEURON differences on this time scale with different nhost and nthread.