activitysim potential performance improvements


This issue is for keeping track of potential performance improvement ideas:

reduce expression solving time and memory needs by better handling string data as pandas categorical data
improve parallelization by taking advantage of updates to Python 3’s multiprocessing library
continue to improve chunksize calculations for more optimized multiprocessing setups
review ct-ramp and daysim performance ideas
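As a rough illustration of the first idea, here is a minimal sketch comparing a string column with its categorical equivalent. The table, column name, and mode labels are made up for the example and are not taken from the ActivitySim code base.

```python
import pandas as pd

# Hypothetical tour table; column name and mode labels are illustrative only.
modes = ["WALK", "BIKE", "DRIVEALONE", "TRANSIT"]
tours = pd.DataFrame({"tour_mode": modes * 25_000})

as_string = tours["tour_mode"]
as_category = tours["tour_mode"].astype("category")

# deep=True counts the bytes held by the string values themselves.
string_bytes = as_string.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)
print(f"string column:      {string_bytes:,} bytes")
print(f"categorical column: {category_bytes:,} bytes")

# Equality tests give the same boolean result for both representations;
# the categorical version compares small integer codes under the hood.
is_walk_string = as_string == "WALK"
is_walk_category = as_category == "WALK"
```

The saving grows with the number of rows and shrinks with the number of distinct values, so it matters most for columns such as mode or purpose labels.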

Please add other ideas, thanks

Feb 04 '21 05:02 bstabler

A configuration file switch that can disable trip-level processing for tours based on tour mode. So, you can shut off (skip) stop_frequency, trip_purpose, trip_destination, trip_scheduling, and trip_mode_choice for walk and bike tours if you don't care about those trips (e.g. inside a global feedback loop iteration, I don't care about walk or bike trips as they don't impact congestion).

Bonus points: the ability to easily flip the switch the other way and restart only the filtered tours (e.g. I've finished all my global feedback loops and I want those non-motorized trips back now).
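To make the request concrete, here is a minimal sketch of how such a switch might partition tours. The setting name `skip_trip_models_for_tour_modes`, the helper function, and the mode labels are all hypothetical; nothing like this is claimed to exist in ActivitySim today.

```python
import pandas as pd

# Hypothetical setting, not an existing ActivitySim option.
settings = {"skip_trip_models_for_tour_modes": ["WALK", "BIKE"]}

# Toy tour table; ids and mode labels are illustrative only.
tours = pd.DataFrame(
    {
        "tour_id": [1, 2, 3, 4],
        "tour_mode": ["WALK", "DRIVEALONE", "BIKE", "TRANSIT"],
    }
).set_index("tour_id")


def split_tours_for_trip_models(tours, skip_modes):
    """Return (tours to run through trip models now, tours deferred for later)."""
    deferred_mask = tours["tour_mode"].isin(skip_modes)
    return tours[~deferred_mask], tours[deferred_mask]


active, deferred = split_tours_for_trip_models(
    tours, settings["skip_trip_models_for_tour_modes"]
)

# Inside a feedback loop, only `active` would go through the trip-level models.
# Flipping the switch later would mean running those models on `deferred` only.
print(active)
print(deferred)
```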

Feb 05 '21 02:02 jpn--

@stefancoe - add reading skim data from disk on-demand as opposed to reading every skim into RAM at the start as a way to trade runtime for RAM. @toliwaga implemented an undocumented version of this during the TVPB caching research and it runs slower but uses a lot less RAM. We may want to complete this feature for general use.
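The runtime-for-RAM trade can be sketched with a NumPy memory map. This is only a stand-in for the idea and not the implementation mentioned above: the file name, matrix size, and raw binary format are assumptions made for the example.

```python
import os
import tempfile

import numpy as np

n_zones = 100
path = os.path.join(tempfile.mkdtemp(), "skim.dat")  # hypothetical file name

# Write a fake zone-to-zone skim matrix to disk as raw float32 values.
rng = np.random.default_rng(0)
full = rng.random((n_zones, n_zones)).astype(np.float32)
full.tofile(path)

# Open the file without reading it into RAM; pages are fetched on access.
skim = np.memmap(path, dtype=np.float32, mode="r", shape=(n_zones, n_zones))

# Look up only the origin-destination pairs that are actually needed.
orig = np.array([0, 5, 42])
dest = np.array([7, 5, 99])
values = np.asarray(skim[orig, dest])
print(values)
```

Each lookup may touch the disk, which is where the extra runtime comes from, while resident memory stays close to the size of the pages actually accessed.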

Feb 10 '21 00:02 bstabler

Some more ideas from discussions with SANDAG:

Move from strings to factors
Exponentiate ahead of time TAP to TAP utilities, along with pre-computing access/egress costs
Smarter binary search / picking of an alternative from a large choice set (such as for location choice)
Make trip destination (i.e. intermediate stop location choice) aware of the tour mode, so that:
  o For bike, walk, and transit, the set of possible MAZs is reduced ahead of time
  o For auto, TAZ to TAZ total utilities are pre-computed to avoid duplicate calculations
Smarter chunking calculations to get more throughput #406
Continued expression review/tidying up to reduce redundancy of calculations (i.e. optimization of written expressions)
Buy a bigger / faster server and test ahead of time in the cloud what’s possible with respect to runtime reductions
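Two of the ideas above, exponentiating utilities once and using a binary search to pick from a large choice set, can be sketched together as follows. The array sizes and random utilities are invented for the example, and this is one possible approach rather than a description of how ActivitySim currently makes choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n_choosers, n_alts = 5, 1000

# Made-up utilities for each chooser and alternative.
utilities = rng.normal(size=(n_choosers, n_alts))

# Exponentiate once (shifted by the row maximum for numerical stability);
# the result could be cached and reused wherever utilities do not change.
exp_utils = np.exp(utilities - utilities.max(axis=1, keepdims=True))
probs = exp_utils / exp_utils.sum(axis=1, keepdims=True)

# Cumulative probabilities are sorted, so each draw can be located by
# binary search instead of a linear scan over all alternatives.
cum_probs = np.cumsum(probs, axis=1)
cum_probs[:, -1] = 1.0  # guard against floating-point round-off

draws = rng.random(n_choosers)
choices = np.array(
    [np.searchsorted(cum_probs[i], draws[i], side="right") for i in range(n_choosers)]
)

# Reference result from a linear scan, for comparison only.
linear_choices = (cum_probs <= draws[:, None]).sum(axis=1)
print(choices)
```

Building the cumulative sum is still linear in the number of alternatives, so the benefit comes mainly when the same cumulative array is reused for many draws.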

Apr 27 '21 15:04 bstabler

Some good ideas here to increase pandas performance. The Pandas eval function looks interesting. Could it replace/substitute python eval in some cases?

Apr 27 '21 16:04 stefancoe

> Could it replace/substitute python eval in some cases?

Not that we couldn't do it more, but we're already using pandas.eval in several places, for example:

https://github.com/ActivitySim/activitysim/blob/bcdc7b63d4ff7bc2703810e226090c75c380bda4/activitysim/core/interaction_simulate.py#L146
https://github.com/ActivitySim/activitysim/blob/bcdc7b63d4ff7bc2703810e226090c75c380bda4/activitysim/core/simulate.py#L443
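For anyone following along, here is a small sketch of the difference between the two approaches. The DataFrame, column names, and expression are invented for the example and do not come from the linked files.

```python
import numpy as np
import pandas as pd

# Toy data; column names are illustrative only.
df = pd.DataFrame(
    {
        "income": [30_000.0, 80_000.0, 120_000.0],
        "auto_time": [10.0, 25.0, 40.0],
    }
)

expression = "df.auto_time * 0.5 + df.income / 10000.0"

# pandas.eval parses the expression itself and can hand it to numexpr
# (when installed), which avoids some intermediate arrays on large frames.
via_pandas = pd.eval(expression, local_dict={"df": df})

# Python's built-in eval runs the same text as ordinary Python code.
via_python = eval(expression, {"df": df})

print(via_pandas)
```

Both calls return the same values here. pandas.eval supports a narrower syntax than Python's eval, so expressions that call arbitrary functions may still need the built-in version.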

Apr 27 '21 17:04 jpn--

Oh, good to know. Thanks for pointing that out!

Apr 27 '21 17:04 stefancoe
