Recommended way to configure model year?
Hi there - I've been playing with PopulationSim for our use (https://github.com/BayAreaMetro/populationsim) and so far, I'm testing it for 2010 by specifying the 2010 control files (for example, https://github.com/BayAreaMetro/populationsim/blob/master/bay_area/households/configs/settings.yaml#L61)
Our standard practice will be to run it for multiple years, but I don't like the obvious solutions:
- having duplicate configs that look very similar, with just the year changed
- having a single config but copying/moving files around at runtime
I'd prefer to have the settings.yaml file contain something like
model_year: 2010
and then
- tablename: MAZ_control_data
  filename : %model_year%_mazData.csv
How do you recommend folks handle this? Thank you!
I agree that this would be a handy feature. It probably makes sense to use anchors and tags to implement this feature.
Fortunately you don't have to wait. You can define a join method and register it as a tag handler with yaml globally, and it will be available to you in all your yaml files. Try the following:
Add this to the import section at the top of run_populationsim.py
import yaml
Put the following before any other executable code in run_populationsim.py (i.e. before the handle_standard_args() call) to install the yaml tag handler
## define custom tag handler
def join(loader, node):
    seq = loader.construct_sequence(node)
    return ''.join([str(i) for i in seq])
yaml.add_constructor('!join', join)
Now, in settings.yaml, you can define an anchor for model_year, and use the !join tag to concatenate it into other strings:
current_model_year: &MODEL_YEAR '2010'
TEST_JOIN:
  - tablename: !join [*MODEL_YEAR, _mazData.csv]
    other_stuff: stuff
You can test this by adding the following somewhere near the beginning of run_populationsim.py:
print "TEST_JOIN:", setting('TEST_JOIN')
exit()
And it should print out:
TEST_JOIN: [{'tablename': '2010_mazData.csv', 'other_stuff': 'stuff'}]
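For anyone who wants to try the tag handler outside of run_populationsim.py, here is a minimal standalone sketch. It makes a few assumptions: PyYAML is installed, the YAML is read from an inline string rather than settings.yaml, and the constructor is registered on SafeLoader so that it works with yaml.safe_load. This thread doesn't say which loader the settings reader actually uses, so that part may need adjusting.

```python
import yaml


def join(loader, node):
    """Concatenate the items of a YAML sequence into a single string."""
    seq = loader.construct_sequence(node)
    return ''.join(str(i) for i in seq)


# Assumption: register on SafeLoader so the tag is honored by yaml.safe_load.
yaml.add_constructor('!join', join, Loader=yaml.SafeLoader)

# Inline stand-in for the relevant part of settings.yaml.
SETTINGS_TEXT = """
current_model_year: &MODEL_YEAR '2010'
TEST_JOIN:
  - tablename: !join [*MODEL_YEAR, _mazData.csv]
    other_stuff: stuff
"""

settings = yaml.safe_load(SETTINGS_TEXT)
print(settings['TEST_JOIN'])
```

Changing the anchored value in one place then changes every string built with !join from that anchor.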
You could avoid defining a custom tag by simply doing something like:
current_model_year: &MODEL_YEAR '2010'
TEST_JOIN:
  - tablename: !!python/object/apply:string.join [[*MODEL_YEAR, _mazData], '']
    other_stuff: stuff
but this introduces yet another Python 3 compatibility issue (the string.join function does not exist in Python 3). Plus it isn't very readable.
Thanks for the quick feedback, Jeff. I also wanted to mention that ODOT will need to think about this question as well. Our next step in the contract is to implement PopulationSim in our Statewide Integrated Model (SWIM), which runs a synthetic population each year for ~30 years (2010-2040) in an automated way into the future. So ODOT could think about this as a more flexible run structure, with the input and output folders being specified more flexibly and dynamically.
If we work out a flexible "data" and "outputs" location and naming process, then one should be able to set up a YAML file that works through a series of changing input and output locations with just one config file...
Yes - the next contract step will be a good opportunity to come up with a standard way of dealing with this common situation.
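As a rough illustration of the single-config idea, the sketch below expands one settings template for each model year by replacing the %model_year% placeholder proposed at the top of this thread. Nothing here is an existing PopulationSim feature: expand_for_year, the data_dir and output_dir keys, and the folder layout are all hypothetical.

```python
def expand_for_year(template, year):
    """Return a copy of template with '%model_year%' replaced in every string."""
    def substitute(value):
        if isinstance(value, str):
            return value.replace('%model_year%', str(year))
        if isinstance(value, list):
            return [substitute(v) for v in value]
        if isinstance(value, dict):
            return {k: substitute(v) for k, v in value.items()}
        return value  # numbers, booleans, None pass through unchanged
    return substitute(template)


# Hypothetical template, as it might look after loading a single settings.yaml.
template = {
    'data_dir': 'data/%model_year%',
    'output_dir': 'output/%model_year%',
    'input_table_list': [
        {'tablename': 'MAZ_control_data',
         'filename': '%model_year%_mazData.csv'},
    ],
}

for year in (2010, 2015):
    settings = expand_for_year(template, year)
    print(year, settings['data_dir'], settings['input_table_list'][0]['filename'])
```

The template itself is never modified, so the same loaded config can be reused for every year in an automated multi-year run.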