Fix issue#1768. Change model_vars[var] from List to Dictionary in datacollector.py
As discussed in the issue, changed model_vars[var] in the datacollector from list to dictionary to keep the step column in get_model_vars_dataframe() and get_agent_vars_dataframe() synchronized.
Changed the test_datacollector.py accordingly to fit it to the new dictionary structure.
One point I'd need some suggestion is,
Previously in the step_assertion function of test_datacollector.py, this line was checking if the index of the currently selected element is less than 4. But actually the index() function returns the first occurrence of an item in the list. So because the model_var list actually contains 5 consecutive 10s the index() function always returns 0 for element=10 but if i understand right, line 133 is supposed to run only for 4 iterations but due to the index() function always returning zero Line 133 runs for 5 consecutive iterations.
I assumed this was unintentional and lline 133 should actually run for 5 iterations so i wrote my test file accordingly. If my assumption was wrong or i need to make any modifications to the code please do let me know.
Performance benchmarks:
| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| Schelling | small | 🔵 -0.2% [-0.6%, +0.1%] | 🔵 -0.2% [-0.5%, +0.0%] |
| Schelling | large | 🔵 -0.3% [-1.1%, +0.5%] | 🔵 -3.3% [-5.2%, -1.9%] |
| WolfSheep | small | 🔵 -0.4% [-1.7%, +0.8%] | 🔵 -1.2% [-1.4%, -1.0%] |
| WolfSheep | large | 🔵 +0.2% [-0.6%, +1.0%] | 🔵 +1.8% [-1.2%, +4.4%] |
| BoidFlockers | small | 🔴 +692.5% [+688.0%, +697.3%] | 🔵 -1.3% [-2.2%, -0.4%] |
| BoidFlockers | large | 🔴 +699.2% [+695.1%, +703.1%] | 🔵 -1.5% [-2.1%, -1.0%] |
Thanks for this PR! It's quite a bit breaking change, because users themselves use model_vars and that now changes dataformat. That users have to add .values() is a breaking change.
We're having a big discussion about the future of the datacollector, mainly in #348. We would like to redesign it in the coming months, and then we should also include this issue. So please participate in that discussion.
(unfortunately, our time is also limited, it's still all volunteer work at this point, so it moves a little slow)
Thanks for this PR! It's quite a bit breaking change, because users themselves use
model_varsand that now changes dataformat. That users have to add.values()is a breaking change.We're having a big discussion about the future of the datacollector, mainly in #348. We would like to redesign it in the coming months, and then we should also include this issue. So please participate in that discussion.
(unfortunately, our time is also limited, it's still all volunteer work at this point, so it moves a little slow)
Got it. I'll look into the discussion
I'd say it's fine to introduce a breaking change after the 2.3 release (if the next release is going to be 3.0). There is no telling when the new data collection implementation will arrive, and even so, it will be part of mesa.experimental, not mesa.DataCollector.
@NirobNabil Just curious, did you try this change with batch_run? I don't think it will matter, but multi-processing can make things interesting.
Thanks for the PR!
@NirobNabil Just curious, did you try this change with
batch_run? I don't think it will matter, but multi-processing can make things interesting.
I haven't yet. I'll do a test at night and let you know update
While I like the direct access this PR provides to model vars, it will take up more space since model.steps is now collected multiple times, right?