mesa icon indicating copy to clipboard operation
mesa copied to clipboard

Cells keep cache after removing connections (3.2.0 version)

Open MalchuL opened this issue 5 months ago • 5 comments

Describe the bug Hex Grid remove_cell doesn't clear cache on cells. I have a simple example where I removing diagonal cells and place agents to separate them. After creating agents I added new cells at same places but without connections. When I run script, I found that some cells uses previous connection before removing cells. I can reproduce this bug at HexGrid.

Workaround

create NoCachedCell and use as cell class

To Reproduce

"""
Simple example demonstrating Mesa's OrthogonalMooreGrid with CellAgent.

This example creates a 5x5 grid where agents can move and interact with neighbors
using the Moore neighborhood (8 surrounding cells).
"""

from random import Random
from typing import Optional, List
import mesa
from mesa.discrete_space import CellAgent, OrthogonalMooreGrid, HexGrid
from mesa.discrete_space.cell import Cell, Coordinate
from mesa.discrete_space.cell_collection import CellCollection



class NoCachedCell(Cell):
    
    @property
    def neighborhood(self) -> CellCollection[Cell]:
        return self.get_neighborhood()

class MoneyAgent(CellAgent):
    """An agent with fixed initial wealth that can move around the grid."""
    
    def __init__(self, model: "MoneyModel", cell) -> None:
        """
        Create a new agent.
        
        Args:
            model: Model instance
            cell: The cell where this agent is located
        """
        super().__init__(model)
        self.cell = cell
        self.prev_cell = cell
        self.wealth = 1
    
    def move(self) -> None:
        """Move to a random cell in the Moore neighborhood."""
        try:
            prev_cell = self.cell
            self.cell = self.cell.neighborhood.select_random_cell()
            self.prev_cell = prev_cell
        except IndexError as e:
            print(self.prev_cell.coordinate, self.cell.coordinate, [c.coordinate for c in self.prev_cell.neighborhood.cells])
            raise e
    
    def give_money(self) -> None:
        """Give 1 unit of wealth to another agent in the same cell."""
        # Get all agents in the current cell except self
        cellmates = [a for a in self.cell.agents if a is not self]
        
        if self.wealth > 0 and cellmates:
            other_agent = self.random.choice(cellmates)
            other_agent.wealth += 1
            self.wealth -= 1


class MoneyModel(mesa.Model):
    """A model with agents moving on a grid and exchanging money."""
    
    def __init__(self, n_agents: int = 10, width: int = 5, height: int = 5, seed: Optional[int] = None) -> None:
        """
        Create a new Money Model.
        
        Args:
            n_agents: Number of agents to create
            width: Width of the grid
            height: Height of the grid
            seed: Random seed for reproducibility
        """
        super().__init__(seed=seed)
        self.num_agents = n_agents
        
        # Create grid with Moore neighborhood and allow multiple agents per cell
        self.grid = HexGrid(
            (width, height),
            torus=True,  # Wrap around edges
            capacity=None,  # Max agents per cell
            random=self.random,
            #cell_klass=NoCachedCell
        )
        # Remove and add cells to the grid to test the grid
        for i in range(width):
            self.grid.remove_cell(self.grid[(i, i)])
        
        
        # Create agents and place them randomly
        agents = MoneyAgent.create_agents(
            self,
            self.num_agents,
            # Randomly select cells for each agent
            self.random.choices(self.grid.all_cells.cells, k=self.num_agents)
        )
        # Add cells to the grid to test the grid
        for i in range(width):
            self.grid.add_cell(self.grid.cell_klass((i, i), random=self.random))

    def step(self) -> None:
        """
        Advance the model by one step:
        1. Have agents move randomly
        2. Have agents give money to others in their cell
        """
        self.agents.shuffle_do("move")
        self.agents.do("give_money")


def print_grid_state(model: MoneyModel) -> None:
    """
    Print a simple text representation of the grid.
    
    Args:
        model: The model to display
    """
    print("\nGrid State (number of agents in each cell):")
    print("-" * (model.grid.width * 4 + 3))
    
    num_agents = 0
    for y in range(model.grid.height-1, -1, -1):
        print("|", end=" ")
        for x in range(model.grid.width):
            cell = model.grid[x, y]
            print(f"{len(cell.agents):2d}", end=" ")
            num_agents += len(cell.agents)
        print("|")
    print("-" * (model.grid.width * 4 + 3))
    print(f"Total number of agents: {num_agents}")


def main():
    """Run a demonstration of the money model."""
    # Create model with 20 agents on a 5x5 grid
    model = MoneyModel(n_agents=20, width=5, height=5, seed=42)
    
    # Run for 10 steps
    for step in range(100):
        print(f"\nStep {step}")
        print_grid_state(model)
        
        # Print wealth stats
        wealth_dist = [agent.wealth for agent in model.agents]
        print("\nWealth Distribution:")
        print(f"Min: {min(wealth_dist)}, Max: {max(wealth_dist)}, "
              f"Average: {sum(wealth_dist)/len(wealth_dist):.2f}")
        
        # Step the model
        model.step()


if __name__ == "__main__":
    main() 

MalchuL avatar Jul 11 '25 14:07 MalchuL

Thanks @MalchuL!

@quaquel Bringing this to your attention/ any thoughts?

tpike3 avatar Jul 25 '25 11:07 tpike3

@quaquel could you reproduce this issue?

EwoutH avatar Sep 24 '25 05:09 EwoutH

I haven't looked at it yet. I missed it over the summer break.

quaquel avatar Sep 24 '25 12:09 quaquel

I had a quick look. I don't see any bug for OrthogonalMooreGrid. As we can see below, there is a nice diagonal from bottom left to top right at t=0 and also at t=11.


step 0

Grid State (number of agents in each cell):
-----------------------
|  1  0  1  0  0 |
|  0  0  2  0  0 |
|  1  1  0  1  0 |
|  4  0  0  1  1 |
|  0  3  1  2  1 |
-----------------------
Total number of agents: 20

# a few steps later
Step 11

Grid State (number of agents in each cell):
-----------------------
|  1  2  1  0  0 |
|  1  0  2  0  2 |
|  1  0  0  0  3 |
|  1  0  1  0  2 |
|  0  0  0  1  2 |
-----------------------
Total number of agents: 20

quaquel avatar Sep 24 '25 12:09 quaquel

I have done some further digging, but now for HexGrid. Indeed, something goes wrong here, but the problem seems to be in how the grid is wired up when torus=true. If a grid is wired up correctly, a neighbor relationship is transitive. That is, if Cell A is a neighbor of Cell B, Cell B is also a neighbor of Cell A.

In case torus=True for HexGrid, this relation does not always seem to hold.

In the provided example, an agent moves from a cell at coordinate (3, 0) to the cell with coordinate (4,4). But if we check the original removal of the cell at coordinate (4,4), this cell has no link to cell (3,0). Thus, when removing it, it did not inform cell (3,0) of its removal.

In fact, we still get the bug if we don't add new cells back on the empty locations. This means that the cell for coordinate (4,4) to which the agent has been moved was part of the original grid and has not been fully garbage collected because cell (3,0) still held a reference to it.

TLD: the issue is the wiring of Hexgrid with torus=True, not anything to do with the caching.

update: it only happens if the grid height or width is odd. This makes perfect sense, actually, if you draw out a hex grid. So my suggested solution is to not allow torus=True on odd width or height because the wiring of the torus is not well defined in those cases.

quaquel avatar Sep 24 '25 14:09 quaquel

@EwoutH @quaquel if the issue is not addressed yet. Shall I take it up?

ShreyasN707 avatar Dec 12 '25 14:12 ShreyasN707

Please check if the issue has been addressed by inspecting the code and closed PRs. I don't recall whether this was closed or not, but the fix seems trivial based on my comment from September.

quaquel avatar Dec 12 '25 15:12 quaquel

@quaquel I checked the current code and recent closed PRs, and the issue is still present. The cache invalidation only clears radius-1 neighborhoods, so any cached get_neighborhood (radius > 1) results continue to include removed cells. I didn’t find any merged PR that changes this behavior, and the code comments still note this limitation. Based on your earlier suggestion, the simplest fix would be to avoid caching neighborhoods for radius > 1 altogether, which would prevent stale results while keeping the usual radius-1 case fast.

I can fix this issue can i go ahead?

ShreyasN707 avatar Dec 12 '25 16:12 ShreyasN707

Please reread my comment:

TLDR: the issue is the wiring of Hexgrid with torus=True, not anything to do with the caching.

update: it only happens if the grid height or width is odd. This makes perfect sense, actually, if you draw out a hex grid. So my suggested solution is to not allow torus=True on odd width or height because the wiring of the torus is not well defined in those cases.

quaquel avatar Dec 12 '25 19:12 quaquel

Please reread my comment:

TLDR: the issue is the wiring of Hexgrid with torus=True, not anything to do with the caching.

update: it only happens if the grid height or width is odd. This makes perfect sense, actually, if you draw out a hex grid. So my suggested solution is to not allow torus=True on odd width or height because the wiring of the torus is not well defined in those cases.

I took another look, and it seems the issue is still present: HexGrid allows torus=True with odd width or height, which can lead to asymmetric neighbor wiring. I couldn’t find any change addressing this so far.

ShreyasN707 avatar Dec 13 '25 13:12 ShreyasN707