Neuron 9 and multisplit

Open SteMasoli opened this issue 3 years ago • 4 comments

Context

Overview of the issue

Hi. After a distro upgrade I recompiled the latest NEURON code, and now the following code no longer sets up multiple cores with multisplit.

h.load_file("parcom.hoc")
p = h.ParallelComputeTool()
p.change_nthread(8,1)
p.multisplit(1)

There are no visible errors but models run only on a single core instead of many.

Expected result/behavior

The models work just fine but they run on only one core. The expected behavior is to use multisplit on all available cores.

NEURON setup

  • Version: [master branch / VERSION 9.0.dev-72-gb226aabd master (b226aabd) 2022-09-16]
  • Installation method [cmake build]
  • OS + Version: [Mint 21 based on Ubuntu 22.04 LTS]
  • Compiler + Version: [gcc (Ubuntu 11.2.0-19ubuntu1)]

Minimal working example - MWE

https://senselab.med.yale.edu/ModelDB/showmodel.cshtml?model=266806#tabs-2
Golgi cell model, morphology_1.
Comment out the cao CONSTANT in the cdp5 mod file, then run:
nrnivmodl mod_files
nrngui -python protocols/01_SS.py

Logs

No error log

SteMasoli avatar Sep 18 '22 14:09 SteMasoli

I agree there is a problem. Version 8.2.1 simulates on my Apple M1 in 17.5 s (starts with 8 threads) and current master simulates in 55 s. Will look into it.

nrnhines avatar Sep 19 '22 14:09 nrnhines

The problem began with 9.0.dev-7-gbd2c7ac2f 'Replace pthreads with std::thread and friends (#1859)' and 9.0.dev-6-ga10145903 does not exhibit the problem.

nrnhines avatar Sep 19 '22 16:09 nrnhines

Apparently this is not specifically a multisplit issue, as the performance results for nrntest/thread show (note that cache1 means cvode.cache_efficient(1)):

hines@michaels-macbook-pro-2 thread % nrniv perf1.hoc           
NEURON -- VERSION 8.2.1 HEAD (c15906924) 2022-08-09
...
nt	 cache0	 cache1
1	 1.79	 0.9
2	 1.34	 0.49
4	 1.1	 0.28
8	 0.66	 0.24

whereas

hines@michaels-macbook-pro-2 thread % nrniv perf1.hoc
NEURON -- VERSION 9.0.dev-72-gb226aabd2 master (b226aabd2) 2022-09-16
...
nt	 cache0	 cache1
1	 1.88	 0.9
2	 1.44	 0.9
4	 1.35	 0.89
8	 1.35	 0.9

nrnhines avatar Sep 19 '22 17:09 nrnhines
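
To put a number on the regression, here is a minimal sketch that divides the 9.0.dev cache1 times by the 8.2.1 cache1 times for each thread count. The timings are copied from the two perf1.hoc tables above; the helper name `slowdown` and the dictionary names are made up for illustration and are not part of NEURON or nrntest.

```python
# Wall-clock times in seconds for the cache1 column, copied from the two
# perf1.hoc tables in this thread. Keys are thread counts (nt).
cache1_8_2_1 = {1: 0.9, 2: 0.49, 4: 0.28, 8: 0.24}
cache1_9_0_dev = {1: 0.9, 2: 0.9, 4: 0.89, 8: 0.9}


def slowdown(old, new):
    """Return {nt: new_time / old_time} for thread counts present in both.

    Hypothetical helper: a ratio near 1 means no regression, a ratio that
    grows with nt means the newer build is not using the extra threads.
    """
    return {nt: new[nt] / old[nt] for nt in sorted(old) if nt in new}


ratios = slowdown(cache1_8_2_1, cache1_9_0_dev)
for nt, ratio in ratios.items():
    print(f"nt={nt}: 9.0.dev takes {ratio:.2f}x the 8.2.1 time")
```

The ratio is 1 with a single thread and grows as threads are added, which is what one would expect if the 9.0.dev build runs the work serially no matter how many threads are requested.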

Here are the details about what is needed to run the above tests.

git clone [email protected]:neuronsimulator/nrntest
cd nrntest/thread
nrniv perf1.hoc
nrniv perf2.hoc

nt and cache1 are the important columns. nt is the number of threads, and ideally cache1 should halve each time nt doubles. The 8.2.1 results for perf2.hoc on my Apple M1 are

nt	 cache0	 cache1
1	 0.36	 0.32
2	 0.19	 0.18
4	 0.11	 0.11
8	 0.11	 0.0999999

nrnhines avatar Sep 20 '22 09:09 nrnhines
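
The "factor of 2 for each line" rule of thumb can be checked with a short script. This is only a sketch: it assumes the three-column `nt cache0 cache1` layout shown in this thread, and the function names `parse_perf_table` and `scaling` are hypothetical helpers, not part of nrntest. It reports speedup relative to nt=1 and parallel efficiency (speedup divided by nt), where 1.0 would be ideal scaling.

```python
def parse_perf_table(text):
    """Parse 'nt cache0 cache1' rows into {nt: (cache0, cache1)}."""
    rows = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a 3-column data row.
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        rows[int(fields[0])] = (float(fields[1]), float(fields[2]))
    return rows


def scaling(rows, column=1):
    """Return {nt: (speedup, efficiency)} relative to the nt=1 row.

    column=0 selects cache0, column=1 selects cache1.
    """
    base = rows[1][column]
    result = {}
    for nt, times in sorted(rows.items()):
        speedup = base / times[column]
        result[nt] = (speedup, speedup / nt)
    return result


# 8.2.1 perf2.hoc results copied from the table above.
PERF2_8_2_1 = """
nt   cache0   cache1
1    0.36     0.32
2    0.19     0.18
4    0.11     0.11
8    0.11     0.0999999
"""

rows = parse_perf_table(PERF2_8_2_1)
result = scaling(rows)
for nt, (speedup, efficiency) in result.items():
    print(f"nt={nt}: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

Pasting a different table into the string should work as long as it keeps the same column layout. An efficiency that stays close to 1.0 for small nt and drops off at nt=8 is consistent with the near-halving followed by a plateau seen in the table.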

Via #1993 I get: image

alexsavulescu avatar Sep 26 '22 15:09 alexsavulescu

Also:

bin/nrniv perf1.hoc 
nt       cache0  cache1
1        10.24   2.61
2        6.02    1.31
4        3.56    0.7
8        2.51    0.56
bin/nrniv perf2.hoc 

nt       cache0  cache1
1        1.11    0.93
2        0.63    0.49
4        0.4     0.31
8        0.27    0.22

alexsavulescu avatar Sep 27 '22 09:09 alexsavulescu

#1993 now runs the Golgi cell model mentioned above on Mac M1 in 17.5 s.

nrnhines avatar Sep 27 '22 16:09 nrnhines