Cells keep cache after removing connections (3.2.0 version)
Describe the bug
Hex Grid remove_cell doesn't clear cache on cells.
I have a simple example where I removing diagonal cells and place agents to separate them. After creating agents I added new cells at same places but without connections. When I run script, I found that some cells uses previous connection before removing cells. I can reproduce this bug at HexGrid.
Workaround
create NoCachedCell and use as cell class
To Reproduce
"""
Simple example demonstrating Mesa's OrthogonalMooreGrid with CellAgent.
This example creates a 5x5 grid where agents can move and interact with neighbors
using the Moore neighborhood (8 surrounding cells).
"""
from random import Random
from typing import Optional, List
import mesa
from mesa.discrete_space import CellAgent, OrthogonalMooreGrid, HexGrid
from mesa.discrete_space.cell import Cell, Coordinate
from mesa.discrete_space.cell_collection import CellCollection
class NoCachedCell(Cell):
@property
def neighborhood(self) -> CellCollection[Cell]:
return self.get_neighborhood()
class MoneyAgent(CellAgent):
"""An agent with fixed initial wealth that can move around the grid."""
def __init__(self, model: "MoneyModel", cell) -> None:
"""
Create a new agent.
Args:
model: Model instance
cell: The cell where this agent is located
"""
super().__init__(model)
self.cell = cell
self.prev_cell = cell
self.wealth = 1
def move(self) -> None:
"""Move to a random cell in the Moore neighborhood."""
try:
prev_cell = self.cell
self.cell = self.cell.neighborhood.select_random_cell()
self.prev_cell = prev_cell
except IndexError as e:
print(self.prev_cell.coordinate, self.cell.coordinate, [c.coordinate for c in self.prev_cell.neighborhood.cells])
raise e
def give_money(self) -> None:
"""Give 1 unit of wealth to another agent in the same cell."""
# Get all agents in the current cell except self
cellmates = [a for a in self.cell.agents if a is not self]
if self.wealth > 0 and cellmates:
other_agent = self.random.choice(cellmates)
other_agent.wealth += 1
self.wealth -= 1
class MoneyModel(mesa.Model):
"""A model with agents moving on a grid and exchanging money."""
def __init__(self, n_agents: int = 10, width: int = 5, height: int = 5, seed: Optional[int] = None) -> None:
"""
Create a new Money Model.
Args:
n_agents: Number of agents to create
width: Width of the grid
height: Height of the grid
seed: Random seed for reproducibility
"""
super().__init__(seed=seed)
self.num_agents = n_agents
# Create grid with Moore neighborhood and allow multiple agents per cell
self.grid = HexGrid(
(width, height),
torus=True, # Wrap around edges
capacity=None, # Max agents per cell
random=self.random,
#cell_klass=NoCachedCell
)
# Remove and add cells to the grid to test the grid
for i in range(width):
self.grid.remove_cell(self.grid[(i, i)])
# Create agents and place them randomly
agents = MoneyAgent.create_agents(
self,
self.num_agents,
# Randomly select cells for each agent
self.random.choices(self.grid.all_cells.cells, k=self.num_agents)
)
# Add cells to the grid to test the grid
for i in range(width):
self.grid.add_cell(self.grid.cell_klass((i, i), random=self.random))
def step(self) -> None:
"""
Advance the model by one step:
1. Have agents move randomly
2. Have agents give money to others in their cell
"""
self.agents.shuffle_do("move")
self.agents.do("give_money")
def print_grid_state(model: MoneyModel) -> None:
"""
Print a simple text representation of the grid.
Args:
model: The model to display
"""
print("\nGrid State (number of agents in each cell):")
print("-" * (model.grid.width * 4 + 3))
num_agents = 0
for y in range(model.grid.height-1, -1, -1):
print("|", end=" ")
for x in range(model.grid.width):
cell = model.grid[x, y]
print(f"{len(cell.agents):2d}", end=" ")
num_agents += len(cell.agents)
print("|")
print("-" * (model.grid.width * 4 + 3))
print(f"Total number of agents: {num_agents}")
def main():
"""Run a demonstration of the money model."""
# Create model with 20 agents on a 5x5 grid
model = MoneyModel(n_agents=20, width=5, height=5, seed=42)
# Run for 10 steps
for step in range(100):
print(f"\nStep {step}")
print_grid_state(model)
# Print wealth stats
wealth_dist = [agent.wealth for agent in model.agents]
print("\nWealth Distribution:")
print(f"Min: {min(wealth_dist)}, Max: {max(wealth_dist)}, "
f"Average: {sum(wealth_dist)/len(wealth_dist):.2f}")
# Step the model
model.step()
if __name__ == "__main__":
main()
Thanks @MalchuL!
@quaquel Bringing this to your attention/ any thoughts?
@quaquel could you reproduce this issue?
I haven't looked at it yet. I missed it over the summer break.
I had a quick look. I don't see any bug for OrthogonalMooreGrid. As we can see below, there is a nice diagonal from bottom left to top right at t=0 and also at t=11.
step 0
Grid State (number of agents in each cell):
-----------------------
| 1 0 1 0 0 |
| 0 0 2 0 0 |
| 1 1 0 1 0 |
| 4 0 0 1 1 |
| 0 3 1 2 1 |
-----------------------
Total number of agents: 20
# a few steps later
Step 11
Grid State (number of agents in each cell):
-----------------------
| 1 2 1 0 0 |
| 1 0 2 0 2 |
| 1 0 0 0 3 |
| 1 0 1 0 2 |
| 0 0 0 1 2 |
-----------------------
Total number of agents: 20
I have done some further digging, but now for HexGrid. Indeed, something goes wrong here, but the problem seems to be in how the grid is wired up when torus=true. If a grid is wired up correctly, a neighbor relationship is transitive. That is, if Cell A is a neighbor of Cell B, Cell B is also a neighbor of Cell A.
In case torus=True for HexGrid, this relation does not always seem to hold.
In the provided example, an agent moves from a cell at coordinate (3, 0) to the cell with coordinate (4,4). But if we check the original removal of the cell at coordinate (4,4), this cell has no link to cell (3,0). Thus, when removing it, it did not inform cell (3,0) of its removal.
In fact, we still get the bug if we don't add new cells back on the empty locations. This means that the cell for coordinate (4,4) to which the agent has been moved was part of the original grid and has not been fully garbage collected because cell (3,0) still held a reference to it.
TLD: the issue is the wiring of Hexgrid with torus=True, not anything to do with the caching.
update: it only happens if the grid height or width is odd. This makes perfect sense, actually, if you draw out a hex grid. So my suggested solution is to not allow torus=True on odd width or height because the wiring of the torus is not well defined in those cases.
@EwoutH @quaquel if the issue is not addressed yet. Shall I take it up?
Please check if the issue has been addressed by inspecting the code and closed PRs. I don't recall whether this was closed or not, but the fix seems trivial based on my comment from September.
@quaquel I checked the current code and recent closed PRs, and the issue is still present. The cache invalidation only clears radius-1 neighborhoods, so any cached get_neighborhood (radius > 1) results continue to include removed cells. I didn’t find any merged PR that changes this behavior, and the code comments still note this limitation. Based on your earlier suggestion, the simplest fix would be to avoid caching neighborhoods for radius > 1 altogether, which would prevent stale results while keeping the usual radius-1 case fast.
I can fix this issue can i go ahead?
Please reread my comment:
TLDR: the issue is the wiring of Hexgrid with torus=True, not anything to do with the caching.
update: it only happens if the grid height or width is odd. This makes perfect sense, actually, if you draw out a hex grid. So my suggested solution is to not allow torus=True on odd width or height because the wiring of the torus is not well defined in those cases.
Please reread my comment:
TLDR: the issue is the wiring of Hexgrid with torus=True, not anything to do with the caching.
update: it only happens if the grid height or width is odd. This makes perfect sense, actually, if you draw out a hex grid. So my suggested solution is to not allow torus=True on odd width or height because the wiring of the torus is not well defined in those cases.
I took another look, and it seems the issue is still present: HexGrid allows torus=True with odd width or height, which can lead to asymmetric neighbor wiring. I couldn’t find any change addressing this so far.