
Too many processes spawned by unittest (Potential memory leak)

Open bamsumit opened this issue 4 years ago • 6 comments

This is difficult to reproduce, but here is the behavior:

  1. pyb -E unit fails at some point with an error like this:
[INFO]  Executing unit tests from Python modules in /home/sshresth/lava-nc/lava/tests
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/usr/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/usr/lib/python3.8/multiprocessing/shared_memory.py", line 102, in __init__
    self._fd = _posixshmem.shm_open(
FileNotFoundError: [Errno 2] No such file or directory: '/psm_ee759658'
  2. Once this happens, the next unittest run starts reporting memory leak errors:
$ python -m unittest discover tests/
.....................................................................................sss......Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7f9e9b076820>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
Runtime not started yet.
Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7f9e9b076820>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
Runtime not started yet.
Runtime not started yet.
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7f9e9b076820>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7f9e9b076820>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7f9e9b076820>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7f9e9b076820>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7f9e9b076820>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7f9e9b076820>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7f9e9b076820>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
..Runtime not started yet.
......E..
======================================================================
ERROR: test_source_sink (lava.proc.conv.test_models.TestConvProcessModels)
Test for source-sink process.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/tests/lava/proc/conv/test_models.py", line 203, in test_source_sink
    sink.run(condition=run_condition, run_cfg=run_config)
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 398, in run
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 92, in initialize
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 146, in _build_sync_channels
  File "/home/sshresth/lava-nc/lava/src/lava/magma/compiler/builder.py", line 685, in build
  File "/home/sshresth/lava-nc/lava/src/lava/magma/compiler/channels/pypychannel.py", line 292, in __init__
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 1385, in SharedMemory
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 502, in Client
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 628, in SocketClient
  File "/usr/lib/python3.8/socket.py", line 231, in __init__
OSError: [Errno 24] Too many open files

----------------------------------------------------------------------
Ran 105 tests in 8.874s

FAILED (errors=1, skipped=3)
/usr/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 2 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

bamsumit avatar Nov 18 '21 21:11 bamsumit

The strange thing is it does not happen when I run the unittests separately:

lava-nc/lava/tests/lava/proc$ python -m unittest discover
.....
----------------------------------------------------------------------
Ran 5 tests in 9.532s

OK
lava-nc/lava/tests/lava/magma$ python -m unittest discover
.....................................................................................sss.....Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7fcc0abd28b0>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7fcc0abd28b0>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Runtime not started yet.
Exception ignored in: <function AbstractProcess.__del__ at 0x7fcc0abd28b0>
Traceback (most recent call last):
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 256, in __del__
    self.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/core/process/process.py", line 417, in stop
    self.runtime.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/runtime.py", line 272, in stop
    self._messaging_infrastructure.stop()
  File "/home/sshresth/lava-nc/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 56, in stop
    actor.join()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 147, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
...Runtime not started yet.
....
----------------------------------------------------------------------
Ran 100 tests in 4.423s

OK (skipped=3)

bamsumit avatar Nov 18 '21 21:11 bamsumit

Is it a Windows issue?

awintel avatar Nov 18 '21 21:11 awintel

No, this is on Ubuntu.

bamsumit avatar Nov 18 '21 21:11 bamsumit

The cause seems to be the large number of processes spawned by the unit tests, which will grow as the module grows. Why so many processes are being spawned needs more investigation, so I am not closing this issue for now.
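
To help with that investigation, here is a minimal diagnostic sketch, not part of Lava. It assumes Linux, where /proc/self/fd lists the descriptors open in the current process. The helper names count_open_fds and snapshot are hypothetical. Calling snapshot() before and after a test may show which tests leave descriptors or child processes behind.

```python
import multiprocessing
import os


def count_open_fds():
    """Return the number of file descriptors open in this process (Linux only)."""
    # /proc/self/fd has one entry per open descriptor. Listing the directory
    # briefly opens one extra descriptor, so compare snapshots rather than
    # reading the absolute value.
    return len(os.listdir("/proc/self/fd"))


def snapshot():
    """Collect open-descriptor and live-child counts for the current process."""
    return {
        "open_fds": count_open_fds(),
        # active_children() also reaps children that have already finished.
        "children": len(multiprocessing.active_children()),
    }


if __name__ == "__main__":
    before = snapshot()
    read_end, write_end = os.pipe()  # stand-in for a resource a test leaks
    after = snapshot()
    os.close(read_end)
    os.close(write_end)
    print("descriptors gained:", after["open_fds"] - before["open_fds"])
    print("children gained:", after["children"] - before["children"])
```

A descriptor count that keeps climbing across tests would be consistent with the OSError: [Errno 24] seen above, although this sketch only reports numbers and does not identify the owner of each descriptor.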

Temporary solution for Linux: check your current open-file limit with ulimit -n and increase it as needed with ulimit -n <increased limit>. The maximum limit can be up to 1048576.
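
The same limit can also be inspected and raised from inside Python with the standard-library resource module. This is a sketch of the workaround, not a fix for the underlying leak. The function name raise_nofile_limit and the default target of 2048 are assumptions for illustration. Unlike ulimit -n in the shell, this changes the limit only for the current process and the children it starts afterwards.

```python
import resource


def raise_nofile_limit(target=2048):
    """Raise the soft open-file limit to `target`, capped at the hard limit.

    Returns the soft limit in effect after the call.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process may not raise its soft limit above the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Never lower a limit that is already high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


if __name__ == "__main__":
    print("soft limit before:", resource.getrlimit(resource.RLIMIT_NOFILE)[0])
    print("soft limit after: ", raise_nofile_limit(2048))
```

Calling this at the top of a test runner script would have roughly the effect of running ulimit -n 2048 first, provided the hard limit allows it.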

bamsumit avatar Nov 19 '21 15:11 bamsumit

I had the same type of error, i.e. OSError: [Errno 24] Too many open files. Solved by running ulimit -n 2048 before pyb -E unit. The failing run looked like this:

(lava) jeremy@jeremy-ZenBook-UX325EA-UX325EA:~/lava$ pyb -E unit
PyBuilder version 0.13.3
Build started at 2021-11-21 22:21:19
------------------------------------------------------------
[INFO]  Installing or updating plugin "pypi:pybuilder_bandit, module name 'pybuilder_bandit'"
[INFO]  Processing plugin packages 'pybuilder_bandit' to be installed with {}
[INFO]  Activated environments: unit
[INFO]  Building lava-nc version 0.1.0
[INFO]  Executing build in /home/jeremy/lava
[INFO]  Going to execute tasks: analyze, publish
[INFO]  Processing plugin packages 'coverage~=5.2' to be installed with {'upgrade': True}
[INFO]  Processing plugin packages 'flake8~=3.7' to be installed with {'upgrade': True}
[INFO]  Processing plugin packages 'pypandoc~=1.4' to be installed with {'upgrade': True}
[INFO]  Processing plugin packages 'setuptools>=38.6.0' to be installed with {'upgrade': True}
[INFO]  Processing plugin packages 'sphinx_rtd_theme' to be installed with {}
[INFO]  Processing plugin packages 'sphinx_tabs' to be installed with {}
[INFO]  Processing plugin packages 'twine>=1.15.0' to be installed with {'upgrade': True}
[INFO]  Processing plugin packages 'unittest-xml-reporting~=3.0.4' to be installed with {'upgrade': True}
[INFO]  Processing plugin packages 'wheel>=0.34.0' to be installed with {'upgrade': True}
[INFO]  Creating target 'build' VEnv in '/home/jeremy/lava/target/venv/build/cpython-3.10.0.final.0'
[INFO]  Processing dependency packages 'requirements.txt' to be installed with {}
[INFO]  Creating target 'test' VEnv in '/home/jeremy/lava/target/venv/test/cpython-3.10.0.final.0'
[INFO]  Processing dependency packages 'requirements.txt' to be installed with {}
[INFO]  Requested coverage for tasks: pybuilder.plugins.python.unittest_plugin:run_unit_tests
[INFO]  Running unit tests
[INFO]  Executing unit tests from Python modules in /home/jeremy/lava/tests
/home/jeremy/lava/src/lava/proc/lif/models.py:129: RuntimeWarning: divide by zero encountered in remainder
  wrapped_curr = np.mod(decayed_curr,
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
Runtime not started yet.
[INFO]  Executed 107 unit tests
[ERROR] Test has error: lava.magma.runtime.test_loihi_protocol.TestProcess.test_synchronization_single_process_model
Traceback (most recent call last):
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/jeremy/lava/tests/lava/magma/runtime/test_loihi_protocol.py", line 63, in test_synchronization_single_process_model
    process.run(condition=RunSteps(num_steps=10), run_cfg=run_config)
  File "/home/jeremy/lava/src/lava/magma/core/process/process.py", line 398, in run
    self._runtime.initialize()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 90, in initialize
    self._build_message_infrastructure()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 119, in _build_message_infrastructure
    self._messaging_infrastructure.start()
  File "/home/jeremy/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 40, in start
    self._smm.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/managers.py", line 562, in start
    self._process.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 54, in _launch
    child_r, parent_w = os.pipe()
OSError: [Errno 24] Too many open files

[ERROR] Test has error: lava.magma.runtime.test_runtime.TestRuntime.test_executable_node_config_assertion
Traceback (most recent call last):
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/jeremy/lava/tests/lava/magma/runtime/test_runtime.py", line 38, in test_executable_node_config_assertion
    runtime2.initialize()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 90, in initialize
    self._build_message_infrastructure()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 119, in _build_message_infrastructure
    self._messaging_infrastructure.start()
  File "/home/jeremy/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 40, in start
    self._smm.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/managers.py", line 562, in start
    self._process.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 53, in _launch
    parent_r, child_w = os.pipe()
OSError: [Errno 24] Too many open files

[ERROR] Test has error: lava.magma.runtime.test_get_set_var.TestGetSetVar.test_get_set_var_using_runtime
Traceback (most recent call last):
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/jeremy/lava/tests/lava/magma/runtime/test_get_set_var.py", line 61, in test_get_set_var_using_runtime
    process.run(condition=RunSteps(num_steps=10), run_cfg=run_config)
  File "/home/jeremy/lava/src/lava/magma/core/process/process.py", line 398, in run
    self._runtime.initialize()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 90, in initialize
    self._build_message_infrastructure()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 119, in _build_message_infrastructure
    self._messaging_infrastructure.start()
  File "/home/jeremy/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 40, in start
    self._smm.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/managers.py", line 562, in start
    self._process.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 53, in _launch
    parent_r, child_w = os.pipe()
OSError: [Errno 24] Too many open files

[ERROR] Test has error: lava.magma.runtime.test_get_set_var.TestGetSetVar.test_get_set_var_using_var_api
Traceback (most recent call last):
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/jeremy/lava/tests/lava/magma/runtime/test_get_set_var.py", line 104, in test_get_set_var_using_var_api
    process.run(condition=RunSteps(num_steps=10), run_cfg=run_config)
  File "/home/jeremy/lava/src/lava/magma/core/process/process.py", line 398, in run
    self._runtime.initialize()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 90, in initialize
    self._build_message_infrastructure()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 119, in _build_message_infrastructure
    self._messaging_infrastructure.start()
  File "/home/jeremy/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 40, in start
    self._smm.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/managers.py", line 562, in start
    self._process.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 53, in _launch
    parent_r, child_w = os.pipe()
OSError: [Errno 24] Too many open files

[ERROR] Test has error: lava.magma.runtime.test_runtime_service.TestRuntimeService.test_runtime_service_start_run
Traceback (most recent call last):
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/jeremy/lava/tests/lava/magma/runtime/test_runtime_service.py", line 71, in test_runtime_service_start_run
    smm.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/managers.py", line 552, in start
    reader, writer = connection.Pipe(duplex=False)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/connection.py", line 532, in Pipe
    fd1, fd2 = os.pipe()
OSError: [Errno 24] Too many open files

[ERROR] Test has error: lava.magma.runtime.test_ref_var_ports.TestRefVarPorts.test_explicit_Ref_Var_port_read
Traceback (most recent call last):
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/jeremy/lava/tests/lava/magma/runtime/test_ref_var_ports.py", line 182, in test_explicit_Ref_Var_port_read
    sender.run(RunSteps(num_steps=1, blocking=True),
  File "/home/jeremy/lava/src/lava/magma/core/process/process.py", line 398, in run
    self._runtime.initialize()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 90, in initialize
    self._build_message_infrastructure()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 119, in _build_message_infrastructure
    self._messaging_infrastructure.start()
  File "/home/jeremy/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 40, in start
    self._smm.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/managers.py", line 552, in start
    reader, writer = connection.Pipe(duplex=False)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/connection.py", line 532, in Pipe
    fd1, fd2 = os.pipe()
OSError: [Errno 24] Too many open files

[ERROR] Test has error: lava.magma.runtime.test_ref_var_ports.TestRefVarPorts.test_explicit_Ref_Var_port_write
Traceback (most recent call last):
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/jeremy/lava/tests/lava/magma/runtime/test_ref_var_ports.py", line 113, in test_explicit_Ref_Var_port_write
    sender.run(RunSteps(num_steps=1, blocking=True),
  File "/home/jeremy/lava/src/lava/magma/core/process/process.py", line 398, in run
    self._runtime.initialize()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 90, in initialize
    self._build_message_infrastructure()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 119, in _build_message_infrastructure
    self._messaging_infrastructure.start()
  File "/home/jeremy/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 40, in start
    self._smm.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/managers.py", line 552, in start
    reader, writer = connection.Pipe(duplex=False)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/connection.py", line 532, in Pipe
    fd1, fd2 = os.pipe()
OSError: [Errno 24] Too many open files

[ERROR] Test has error: lava.magma.runtime.test_ref_var_ports.TestRefVarPorts.test_implicit_Ref_Var_port_read
Traceback (most recent call last):
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/jeremy/lava/tests/lava/magma/runtime/test_ref_var_ports.py", line 211, in test_implicit_Ref_Var_port_read
    recv.run(RunSteps(num_steps=1, blocking=True),
  File "/home/jeremy/lava/src/lava/magma/core/process/process.py", line 398, in run
    self._runtime.initialize()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 90, in initialize
    self._build_message_infrastructure()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 119, in _build_message_infrastructure
    self._messaging_infrastructure.start()
  File "/home/jeremy/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 40, in start
    self._smm.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/managers.py", line 552, in start
    reader, writer = connection.Pipe(duplex=False)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/connection.py", line 532, in Pipe
    fd1, fd2 = os.pipe()
OSError: [Errno 24] Too many open files

[ERROR] Test has error: lava.magma.runtime.test_ref_var_ports.TestRefVarPorts.test_implicit_Ref_Var_port_write
Traceback (most recent call last):
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/jeremy/lava/tests/lava/magma/runtime/test_ref_var_ports.py", line 147, in test_implicit_Ref_Var_port_write
    sender.run(RunSteps(num_steps=1, blocking=True),
  File "/home/jeremy/lava/src/lava/magma/core/process/process.py", line 398, in run
    self._runtime.initialize()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 90, in initialize
    self._build_message_infrastructure()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 119, in _build_message_infrastructure
    self._messaging_infrastructure.start()
  File "/home/jeremy/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 40, in start
    self._smm.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/managers.py", line 552, in start
    reader, writer = connection.Pipe(duplex=False)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/connection.py", line 532, in Pipe
    fd1, fd2 = os.pipe()
OSError: [Errno 24] Too many open files

[ERROR] Test has error: lava.magma.runtime.test_ref_var_ports.TestRefVarPorts.test_unconnected_Ref_Var_ports
Traceback (most recent call last):
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/jeremy/lava/tests/lava/magma/runtime/test_ref_var_ports.py", line 93, in test_unconnected_Ref_Var_ports
    sender.run(RunSteps(num_steps=3, blocking=True),
  File "/home/jeremy/lava/src/lava/magma/core/process/process.py", line 398, in run
    self._runtime.initialize()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 90, in initialize
    self._build_message_infrastructure()
  File "/home/jeremy/lava/src/lava/magma/runtime/runtime.py", line 119, in _build_message_infrastructure
    self._messaging_infrastructure.start()
  File "/home/jeremy/lava/src/lava/magma/runtime/message_infrastructure/multiprocessing.py", line 40, in start
    self._smm.start()
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/managers.py", line 552, in start
    reader, writer = connection.Pipe(duplex=False)
  File "/home/jeremy/anaconda3/envs/lava/lib/python3.10/multiprocessing/connection.py", line 532, in Pipe
    fd1, fd2 = os.pipe()
OSError: [Errno 24] Too many open files

------------------------------------------------------------
BUILD FAILED - There were 10 error(s) and 0 failure(s) in unit tests (site-packages/pybuilder/plugins/python/unittest_plugin.py:109)
------------------------------------------------------------
Build finished at 2021-11-21 22:21:47
Build took 27 seconds (27958 ms)

jeremyforest avatar Nov 22 '21 03:11 jeremyforest

The cause seems to be the large number of processes spawned by the unit tests, which will grow as the module grows. Why so many processes are being spawned needs more investigation, so I am not closing this issue for now.

Temporary solution for Linux: check your current open-file limit with ulimit -n and increase it as needed with ulimit -n <increased limit>. The maximum limit can be up to 1048576.

Thank you! I also encountered this error on Ubuntu 21.10 after expanding an SNN from 21 LIF neurons with 68 synapses to 24 LIF neurons with 80 synapses. Raising the ulimit from 1024 to 2048 resolved the issue for me :)

a-t-0 avatar Apr 15 '22 14:04 a-t-0
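
The tracebacks in this thread show cleanup being triggered from AbstractProcess.__del__ and the manager being started via self._smm.start(). One direction that might be worth trying is to release such resources explicitly at the end of every test instead of relying on garbage collection. The sketch below illustrates that pattern with the standard-library SharedMemoryManager only. The class SharedMemoryTestCase and the function run_suite are hypothetical and do not come from the Lava code base, and whether this addresses the leak reported here is an assumption that has not been verified.

```python
import unittest
from multiprocessing.managers import SharedMemoryManager


class SharedMemoryTestCase(unittest.TestCase):
    """Hypothetical test case: one manager per test, always shut down."""

    def setUp(self):
        self.smm = SharedMemoryManager()
        self.smm.start()
        # addCleanup callbacks run even if the test body raises, so the
        # manager process and its pipes are released deterministically.
        self.addCleanup(self.smm.shutdown)

    def test_roundtrip(self):
        shm = self.smm.SharedMemory(size=16)
        shm.buf[:4] = b"lava"
        self.assertEqual(bytes(shm.buf[:4]), b"lava")


def run_suite():
    """Run the example test case and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(SharedMemoryTestCase)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

In a Lava test the equivalent step would presumably be an explicit stop of the process under test in the cleanup, but the exact call depends on the Lava version and is not shown here.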