sPyNNaker
TypeError when adding an EthernetControlPopulation after simulation has been started
I have yet another problem with Spinnaker-Integration in NRP. I hope you still have some patience left for me :-)
Should it be possible to add or change an EthernetControlPopulation after a simulation has been started? When I try to do this, I get TypeError: unsupported operand type(s) for *: 'NoneType' and 'float'
Minimum example + workaround
from pyNN import spiNNaker as sim
import os
import tempfile
import threading
import time
import traceback
from enum import Enum
from spynnaker.pyNN.external_devices_models import \
    AbstractMulticastControllableDevice, \
    AbstractEthernetTranslator
from spinn_front_end_common.utilities.globals_variables import get_simulator
class SendType(Enum):
    SEND_TYPE_ACCUM = 2

class MyDevice(AbstractMulticastControllableDevice):
    @property
    def device_control_key(self): return 123
    @property
    def device_control_max_value(self): return 10.0
    @property
    def device_control_min_value(self): return 0.0
    @property
    def device_control_partition_id(self): return "my partition id"
    @property
    def device_control_send_type(self): return SendType.SEND_TYPE_ACCUM
    @property
    def device_control_timesteps_between_sending(self): return 500
    @property
    def device_control_uses_payload(self): return False

class MyAbstractEthernetTranslator(AbstractEthernetTranslator):
    def translate_control_packet(self, multicast_packet): pass
os.chdir(tempfile.mkdtemp())
print(os.getcwd())
sim.setup()
device = MyDevice()
translator = MyAbstractEthernetTranslator()
model = sim.external_devices.ExternalDeviceLifControl([device], False, translator)
runner = threading.Thread(target=sim.external_devices.run_forever)
runner.daemon = True
runner.start()
print("sleeping 50s until brain is loaded")
time.sleep(50) # it takes about 47s for the brain to load onto the board
try:
    print("trying to add EthernetControlPopulation")
    population = sim.external_devices.EthernetControlPopulation(1, model)
except Exception as e:
    print(e)
    traceback.print_exc()
    if get_simulator()._current_run_timesteps is None:
        print("applying workaround")
        get_simulator()._current_run_timesteps = 0
    print("trying to add EthernetControlPopulation again")
    population = sim.external_devices.EthernetControlPopulation(1, model)
print("sleep 2s")
time.sleep(2)
sim.end()
relevant error message + stacktrace
Traceback (most recent call last):
  File "/home/bjuergens/Documents/MVE_restart_spinnaker/src/test_tf.py", line 61, in <module>
    population = sim.external_devices.EthernetControlPopulation(1, model)
  File "/home/bjuergens/.opt/platform_venv/local/lib/python2.7/site-packages/spynnaker8/external_devices/__init__.py", line 173, in EthernetControlPopulation
    population = Population(n_neurons, model, label=label)
  File "/home/bjuergens/.opt/platform_venv/local/lib/python2.7/site-packages/spynnaker8/models/populations/population.py", line 70, in __init__
    Recorder.__init__(self, population=self)
  File "/home/bjuergens/.opt/platform_venv/local/lib/python2.7/site-packages/spynnaker8/models/recorder.py", line 50, in __init__
    self._recording_start_time = get_simulator().t
  File "/home/bjuergens/.opt/platform_venv/local/lib/python2.7/site-packages/spynnaker8/spinnaker.py", line 244, in t
    self._current_run_timesteps * (self.machine_time_step / 1000.0))
TypeError: unsupported operand type(s) for *: 'NoneType' and 'float'
Note: I do not know if the workaround actually solves anything or if it just makes the error message go away without actually adding a functional EthernetControlPopulation
The error is confusing but the underlying cause is that you can NOT add a population to a running simulation!
The workaround actually only makes the situation worse, as it changes state that the simulator relies on for its checks.
get_simulator()._current_run_timesteps being None signals that the system is currently running for an undetermined time (we call it run forever), while a 0 says it has not yet run.
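To see why the traceback ends in a TypeError, here is a small stand-alone sketch. ToySimulator is a hypothetical stand-in, not the sPyNNaker class. It only copies the arithmetic of the `t` line shown in the traceback and the two states described above, and the time step value is an assumption for illustration.

```python
class ToySimulator(object):
    """Hypothetical stand-in that mirrors only the `t` arithmetic from the traceback."""

    def __init__(self, machine_time_step=1000):
        # Assumed to be in microseconds; the value is for illustration only
        self.machine_time_step = machine_time_step
        # 0 means "has not run yet"; None means "running forever"
        self._current_run_timesteps = 0

    @property
    def t(self):
        # Same expression as the line shown in the traceback
        return self._current_run_timesteps * (self.machine_time_step / 1000.0)


toy = ToySimulator()
time_before_run = toy.t  # 0 * 1.0, so the time is well defined

toy._current_run_timesteps = None  # state while a run forever is in progress
try:
    toy.t
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(time_before_run)
print(error_message)
```

So the TypeError is a side effect of asking for the current time while it is undefined. Overwriting the None with 0 hides that symptom but tells the simulator it has not run yet.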
The reason why adding a population is not possible is because we load all binaries on to the machine before running. Adding a population would require adding new binaries which is not possible at the same time as running.
Stopping the run, then adding a population, and then continuing is also not possible (or at least too hard), as at worst it could require more boards than are currently allocated. It would also mean changing data in existing binaries, such as routing tables.
We do support sim.setup() ... (adding populations etc.) ... sim.run(100) ... (adding populations etc.) ... sim.reset()
But that causes all the state in the simulation, including time, to be set back to the state at time 0.
I also do not know if reset is supported after a runner = threading.Thread(target=sim.external_devices.run_forever) and then the normal stop method.
> The error is confusing but the underlying cause is that you can NOT add a population to a running simulation!
Oki. Thanks
> I also do not know if reset is supported after a runner = threading.Thread(target=sim.external_devices.run_forever) and then the normal stop method.
Would the problem in this case be sim.external_devices.run_forever or threading.Thread?
The difficult bit after a run_forever/threading is knowing at what time the cores stopped, checking they all stopped at the same time and then restarting them at that time.
I don't think we ever considered that.
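The check described above can be sketched in a few lines. This is only an illustration: common_stop_timestep and the per-core dictionary are hypothetical, and nothing like this is claimed to exist in sPyNNaker. How the stop times would be read back from the machine is not covered.

```python
def common_stop_timestep(stop_timesteps):
    """Return the timestep at which every core stopped, or raise if they disagree.

    stop_timesteps: dict mapping a core label to the timestep it stopped at
    (hypothetical data, made up for this example).
    """
    if not stop_timesteps:
        raise ValueError("no cores reported a stop timestep")
    distinct = set(stop_timesteps.values())
    if len(distinct) != 1:
        raise RuntimeError(
            "cores stopped at different timesteps: %s" % sorted(distinct))
    return distinct.pop()


# All cores agree, so a restart time can be determined
in_sync = common_stop_timestep({"core_a": 1500, "core_b": 1500})

# One core stopped earlier, so there is no single time to restart from
try:
    common_stop_timestep({"core_a": 1500, "core_b": 1498})
    out_of_sync_error = None
except RuntimeError as exc:
    out_of_sync_error = str(exc)
```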
Would it not be possible to put the sim.external_devices.EthernetControlPopulation(1, model) in before starting, but then only give it data when you need to?
> The difficult bit after a run_forever/threading is knowing at what time the cores stopped, checking they all stopped at the same time and then restarting them at that time.
> I don't think we ever considered that.
Reset should work with this, since then we are going back to time 0. I don't know if we have ever tested that, but there is no time like the present!
Resetting the simulation seems to work fine. Thanks.
What would be a good check to do before resetting the simulation? I.e. I probably don't need to reset it every single time I add a Population. Is it enough if I check if the simulation has been started? (If so, what would be a good way to do that?)
I am currently checking if get_simulator()._current_run_timesteps is None, but that seems a little hacky to me. I'm wondering if you have a better idea.
Once the command to start a simulation has been given either through sim.run(..) or runner.start() it is not possible to add a population successfully.
This is because the simulator will have started the preparation phase. At best, a population added at this time will be ignored; at worst, it will cause some mismatch between data from different preparation phases, with unknown consequences.
The main phases are:
Partitioning: Split the populations and projections so that they fit on individual cores.
Placing: Determine where each of the above goes.
Routing: Work out how packets are sent between each of the above.
Loading: Load the executables and their data onto the machine.
Running
There is no way to add additional populations during these phases. Therefore you must not add any new populations or projections between start and the matching stop/end.
After the end of run or the stop/end you can add populations and then call reset but I highly recommend first calling reset and then adding the population.
Please also be aware that calling start and then reset without a proper stop/end is also highly likely to give undesired output as we have never even considered that workflow.
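On the user side, one way to avoid peeking at _current_run_timesteps is to track the lifecycle yourself. The sketch below is an illustration under assumptions, not part of sPyNNaker: RunGuard and its method names are hypothetical, and the reset and add_population callables are supplied by the caller (for example sim.reset and a function that builds the population). It follows the advice above: nothing is added between start and the matching stop, and after a run it resets first and adds afterwards.

```python
class RunGuard(object):
    """Hypothetical user-side tracker of whether a run has been started."""

    def __init__(self, reset):
        self._reset = reset      # e.g. sim.reset, passed in by the caller
        self._running = False    # True between start and the matching stop
        self._has_run = False    # True once a run has started and no reset followed

    def mark_started(self):
        # Call this just before sim.run(...) or runner.start()
        self._running = True
        self._has_run = True

    def mark_stopped(self):
        # Call this after the run has ended or been stopped properly
        self._running = False

    def add(self, add_population):
        if self._running:
            raise RuntimeError(
                "cannot add a population between start and stop")
        if self._has_run:
            # Reset first, then add, as recommended above
            self._reset()
            self._has_run = False
        return add_population()


# Usage with stand-in callables that only record the call order
calls = []
guard = RunGuard(reset=lambda: calls.append("reset"))
guard.add(lambda: calls.append("add first"))    # before any run: no reset
guard.mark_started()
try:
    guard.add(lambda: calls.append("add during run"))
    refused = False
except RuntimeError:
    refused = True
guard.mark_stopped()
guard.add(lambda: calls.append("add second"))   # after a run: reset, then add
```

The guard only knows what it is told, so it is up to the calling code to invoke mark_started and mark_stopped at the right moments.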
When I get around to doing the cleanup in https://github.com/SpiNNakerManchester/sPyNNaker8/issues/374, the system will fail fast if a population or projection is added during a run, INCLUDING the preparation phases. At that point it will continue to refuse any command except a brand new setup.
The way the system works, the only way to add a Population after a simulation has started is to reset the simulation completely.
Changing this is totally out of scope, especially now that the core development team is being reduced due to the end of various funding projects.