Issue with data collection of PropertyLayer objects
Describe the bug When using a PropertyLayer with DataCollector, the data is not captured properly at each model step. Specifically, the grid of values for the PropertyLayer instance is stored for every step only as it appears in the final model step, i.e. all steps store the same array of values.
Expected behavior The DataCollector should store at each step the value of the PropertyLayer at that step as a model variable. This would allow visualization of the distribution of the variable at each step of the model to examine, for example, the accumulation of pollution in the Sugarscape model.
To Reproduce The toy model below reproduces the issue. Check the 'Pollution' column of the model_results object to see the problem. Minimally, one would expect a grid of 0's for the first model step, which is not the case here either.
import mesa


class testAgent(mesa.Agent):
    def __init__(self, unique_id, model, pos):
        super().__init__(unique_id, model)
        self.pos = pos

    def pollute(self):
        self.model.grid.properties["pollution"].modify_cell(self.pos, lambda p: p+1)


class testModel(mesa.Model):
    def __init__(self, width = 20, height = 20, initial_population=200):
        super().__init__()
        self.width = width
        self.height = height
        self.initial_population = initial_population
        # initiate activation schedule
        self.schedule = mesa.time.RandomActivationByType(self) #activate different type of agents in different order each time
        # initiate mesa grid class
        self.grid = mesa.space.MultiGrid(self.width, self.height, torus=False)
        #add pollution property layer
        pollution_layer = mesa.space.PropertyLayer(name = "pollution", width = self.width, height = self.height, default_value = 0)
        self.grid.add_property_layer(pollution_layer)
        self.datacollector = mesa.DataCollector(model_reporters = {"Pollution": lambda m: m.grid.properties["pollution"].data})
        agent_id = 0
        for i in range(self.initial_population):
            # get agent position
            x = self.random.randrange(self.width)
            y = self.random.randrange(self.height)
            agent = testAgent(agent_id, self, (x, y))
            # place agent
            self.grid.place_agent(agent, (x, y))
            self.schedule.add(agent)
            agent_id += 1

    def randomizeAgents(self):
        agents_shuffle = list(self.schedule.agents_by_type[testAgent].values())
        self.random.shuffle(agents_shuffle)
        return agents_shuffle

    def step(self):
        agents_shuffle = self.randomizeAgents()
        for agent in agents_shuffle:
            agent.pollute()
        self.datacollector.collect(self)

    def run_model(self, step_count=20):
        for i in range(step_count):
            self.step()


model = testModel()
model.run_model()
model_results = model.datacollector.get_model_vars_dataframe()
Additional context This should be running on the most recent version of Mesa (my version was updated about a week ago).
Thanks for opening this issue and writing it up!
@rht I looked a bit into this issue and I can reproduce it. I think it has to do with either the model clock or the datacollector. For example, Model._steps isn't updated with each step. It might be related to how #1942 was implemented. Do you have the opportunity to investigate this further?
This is expected in the new API. Because you are not using model.schedule.step(), the model._steps doesn't get updated automatically. You need to manually advance the clock. See https://github.com/projectmesa/mesa-examples/blob/1e0d6b855c853c1dfa749348445a31000b4daca4/examples/schelling_experimental/model.py#L65-L77
Thanks. I tried adding a call to self._advance_time() just before data collection in the toy model and the problem persists. I should also note that in the model where this problem first arose, which is a version of the sugarscape_g1mt model from the mesa examples catalog, there is a call self._steps += 1 just before data collection, which I assume does essentially the same thing. Thus, neither one seems to solve the problem.
I can reproduce this bug using the AgentSet API and other modern Mesa best practices.
import mesa


class testAgent(mesa.Agent):
    def __init__(self, unique_id, model):
        super().__init__(unique_id, model)
        self.pos: tuple

    def pollute(self):
        self.model.grid.properties["pollution"].modify_cell(self.pos, lambda p: p+1)


class testModel(mesa.Model):
    def __init__(self, width = 20, height = 20, initial_population=200):
        super().__init__()
        self.width = width
        self.height = height
        self.initial_population = initial_population
        # initiate mesa grid class
        self.grid = mesa.space.MultiGrid(self.width, self.height, torus=False)
        #add pollution property layer
        pollution_layer = mesa.space.PropertyLayer(name = "pollution", width = self.width, height = self.height, default_value = 0.0, dtype=float)
        self.grid.add_property_layer(pollution_layer)
        self.datacollector = mesa.DataCollector(model_reporters = {
            "Pollution": lambda m: m.grid.properties["pollution"].data,
            "Step": "_steps"
        })
        agent_id = 0
        for i in range(self.initial_population):
            # place each agent on a random empty cell, using the model's own RNG
            agent = testAgent(agent_id, self)
            self.grid.place_agent(agent, pos=self.random.choice(list(self.grid.empties)))
            agent_id += 1

    def step(self):
        self.agents.shuffle().do("pollute")
        # single quotes inside the f-string so this also parses before Python 3.12
        print(f"Step {self._steps}: {self.grid.properties['pollution'].data[:1]}")
        self.datacollector.collect(self)
        self._advance_time()


# Run the model
model = testModel()
for i in range(20):
    model.step()
model_results = model.datacollector.get_model_vars_dataframe()
model_results
It prints the correct values each step, so it seems the bug is really in the datacollector itself.
I suspect for some reason it doesn't save a copy of the values at that timestep to model_vars, but a reference to the values that gets continuously updated with each step. @rht do you have an idea where that could happen?
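A minimal sketch of the suspected behavior, using plain NumPy rather than Mesa itself. The `layer` array is only a stand-in for the PropertyLayer's `data`, and the two lists are hypothetical stand-ins for what a collector might store. It assumes the layer is updated in place each step, which is what the symptoms suggest:

```python
import numpy as np

# Stand-in for PropertyLayer.data: one mutable array, updated in place each step.
layer = np.zeros((2, 2))

by_reference = []  # what a collector holds if it appends the array object itself
by_copy = []       # what it holds if it appends a snapshot of the array

for step in range(3):
    layer += 1                    # in-place update, analogous to agents polluting cells
    by_reference.append(layer)    # every entry is the same object
    by_copy.append(layer.copy())  # every entry is an independent snapshot

# All "reference" entries show the final state; the copies keep the history.
print([int(a[0, 0]) for a in by_reference])  # [3, 3, 3]
print([int(a[0, 0]) for a in by_copy])       # [1, 2, 3]
```

If this is indeed the cause, having the reporter return a snapshot, e.g. `lambda m: m.grid.properties["pollution"].data.copy()`, might work around it on the user side. That has not been verified in this thread.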
Found the issue and implemented a fix in #2129.