Issue with data collection of PropertyLayer objects

Open dylan-munson opened this issue 1 year ago • 6 comments

Describe the bug When using a PropertyLayer with DataCollector, the data is not captured correctly at each model step. Specifically, the grid of values for the PropertyLayer instance is stored at every step only as it appears in the final model step, i.e. all steps store the same array of values.

Expected behavior The DataCollector should store at each step the value of the PropertyLayer at that step as a model variable. This would allow visualization of the distribution of the variable at each step of the model to examine, for example, the accumulation of pollution in the Sugarscape model.

To Reproduce The toy model below reproduces the issue. Check the 'Pollution' column of the model_results object to see the problem. Minimally, one would expect a grid of 0's for the first model step, which is not the case here either.

import mesa

class testAgent(mesa.Agent):
    def __init__(self, unique_id, model, pos):
        super().__init__(unique_id, model)
        self.pos = pos
    def pollute(self):
        self.model.grid.properties["pollution"].modify_cell(self.pos, lambda p: p+1)

class testModel(mesa.Model):
    def __init__(self, width = 20, height = 20, initial_population=200):
        super().__init__()
        self.width = width
        self.height = height
        self.initial_population = initial_population
        
        # initiate activation schedule
        self.schedule = mesa.time.RandomActivationByType(self) #activate different type of agents in different order each time
        # initiate mesa grid class
        self.grid = mesa.space.MultiGrid(self.width, self.height, torus=False)
        #add pollution property layer
        pollution_layer = mesa.space.PropertyLayer(name = "pollution", width = self.width, height = self.height, default_value = 0)
        self.grid.add_property_layer(pollution_layer)
        
        self.datacollector = mesa.DataCollector(model_reporters = {"Pollution": lambda m: m.grid.properties["pollution"].data})
        
        agent_id = 0
        for i in range(self.initial_population):
            # get agent position
            x = self.random.randrange(self.width)
            y = self.random.randrange(self.height)
            agent = testAgent(agent_id, self, (x, y))
            # place agent
            self.grid.place_agent(agent, (x, y))
            self.schedule.add(agent)
            agent_id += 1
    
    def randomizeAgents(self):
        agents_shuffle = list(self.schedule.agents_by_type[testAgent].values())
        self.random.shuffle(agents_shuffle)

        return agents_shuffle
    
    def step(self):
        agents_shuffle = self.randomizeAgents()
        for agent in agents_shuffle:
            agent.pollute()
        
        self.datacollector.collect(self)
       
    def run_model(self, step_count=20):
        for i in range(step_count):
            self.step()
            
model = testModel()
model.run_model()
model_results = model.datacollector.get_model_vars_dataframe()

Additional context This should be running on the most recent version of Mesa (my version was updated about a week ago).

dylan-munson avatar May 05 '24 14:05 dylan-munson

Thanks for opening this issue and writing it up!

@rht I looked a bit into this issue and I can reproduce it. I think it has to do with either the model clock or the datacollector. Model._steps isn't updated for example with each step. It might be related to how #1942 was implemented. Do you have the opportunity to investigate this further?

EwoutH avatar May 05 '24 17:05 EwoutH

This is expected in the new API. Because you are not using model.schedule.step(), the model._steps doesn't get updated automatically. You need to manually advance the clock. See https://github.com/projectmesa/mesa-examples/blob/1e0d6b855c853c1dfa749348445a31000b4daca4/examples/schelling_experimental/model.py#L65-L77

rht avatar May 06 '24 07:05 rht

Thanks. I tried adding a call to self._advance_time() just before data collection in the toy model, and the problem persists. I should also note that in the model I was working with where this problem first arose, which is a version of the sugarscape_g1mt model from the mesa examples catalog, there is a call self._steps += 1 just before data collection, which I assume does essentially the same thing. Thus, neither one seems to solve the problem.

dylan-munson avatar May 06 '24 08:05 dylan-munson

I can reproduce this bug using the AgentSet API and other modern Mesa best practices.

import mesa

class testAgent(mesa.Agent):
    def __init__(self, unique_id, model):
        super().__init__(unique_id, model)
        self.pos: tuple
    def pollute(self):
        self.model.grid.properties["pollution"].modify_cell(self.pos, lambda p: p+1)

class testModel(mesa.Model):
    def __init__(self, width = 20, height = 20, initial_population=200):
        super().__init__()
        self.width = width
        self.height = height
        self.initial_population = initial_population
        
        # initiate mesa grid class
        self.grid = mesa.space.MultiGrid(self.width, self.height, torus=False)
        #add pollution property layer
        pollution_layer = mesa.space.PropertyLayer(name = "pollution", width = self.width, height = self.height, default_value = 0.0, dtype=float)
        self.grid.add_property_layer(pollution_layer)
        
        self.datacollector = mesa.DataCollector(model_reporters = {
            "Pollution": lambda m: m.grid.properties["pollution"].data,
            "Step": "_steps"
        })
        
        agent_id = 0
        for i in range(self.initial_population):
            # get agent position
            agent = testAgent(agent_id, self)
            self.grid.place_agent(agent, pos=self.random.choice(list(self.grid.empties)))
            agent_id += 1
            
    
    def step(self):
        self.agents.shuffle().do("pollute")

        print(f"Step {self._steps}: {self.grid.properties['pollution'].data[:1]}")
        self.datacollector.collect(self)
        self._advance_time()

# Run the model
model = testModel()
for i in range(20):
    model.step()

model_results = model.datacollector.get_model_vars_dataframe()
model_results

It prints the correct values each step, so it seems the bug is really in the datacollector itself.

EwoutH avatar May 06 '24 09:05 EwoutH

I suspect for some reason it doesn't save a copy of the values at that timestep to model_vars, but a reference to the values that gets continuously updated with each step. @rht do you have an idea where that could happen?
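The reference-versus-copy behaviour suspected here can be illustrated without Mesa. The sketch below is only an illustration under stated assumptions: `Layer` and `run` are hypothetical stand-ins, not Mesa's PropertyLayer or DataCollector, and a nested list stands in for the layer's array.

```python
import copy


class Layer:
    """Hypothetical stand-in for a property layer: holds one mutable grid."""

    def __init__(self, width, height):
        self.data = [[0] * width for _ in range(height)]

    def pollute(self, x, y):
        # In-place update, similar in spirit to modify_cell in the toy model.
        self.data[y][x] += 1


def run(steps, snapshot):
    """Collect layer.data after each step, optionally deep-copying it."""
    layer = Layer(2, 2)
    collected = []
    for _ in range(steps):
        layer.pollute(0, 0)
        value = layer.data
        # Without a copy, the list stores the same object on every step.
        collected.append(copy.deepcopy(value) if snapshot else value)
    return collected


by_reference = run(3, snapshot=False)
by_copy = run(3, snapshot=True)

# Every stored entry is the same object, so all rows show the final state.
print([grid[0][0] for grid in by_reference])  # [3, 3, 3]
# Each stored entry is an independent snapshot of that step.
print([grid[0][0] for grid in by_copy])  # [1, 2, 3]
```

If the same mechanism is at work in the toy model, a user-side workaround would be to return a copy from the reporter, for example calling `.copy()` on the layer's `data` array inside the lambda, so that each collected value is independent of later in-place updates.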

EwoutH avatar May 06 '24 09:05 EwoutH

Found the issue and implemented a fix in #2129.

EwoutH avatar May 06 '24 09:05 EwoutH