
Multiprocessing Start_time Error

Open bettinardi opened this issue 1 year ago • 0 comments

From Stefan Coe - Quick question. Are you running PopulationSim with multiprocessing? If so, which version of ActivitySim are you using in your Python environment? I am using version 1.2.1 and get an error when running the CALM example with multiprocessing:

"UnboundLocalError: local variable 'start_time' referenced before assignment"

From Nick Fournier - Oh, I know exactly the error you found.

TL;DR: Deep in the ActivitySim code, the system time (start_time) is set inside an if-statement that gets skipped when the code is run from PopulationSim, so the variable is referenced before it is ever assigned. The assignment should sit a few lines earlier, above the if-statement. The bug was introduced when "sharrow" was added. I have an open pull request that fixes it, but there is a lot of flux in ActivitySim, so I am not sure if or when it will be included.
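To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not the actual ActivitySim code: the function names and the `sharrow_enabled` flag are hypothetical stand-ins for whatever condition gets skipped when running from PopulationSim.

```python
import time


def run_step_buggy(sharrow_enabled: bool) -> float:
    """Hypothetical reproduction: start_time is only set inside the branch."""
    if sharrow_enabled:
        start_time = time.perf_counter()
    # ... the step's real work would happen here ...
    # If the branch above was skipped, start_time was never assigned and
    # the next line raises UnboundLocalError.
    return time.perf_counter() - start_time


def run_step_fixed(sharrow_enabled: bool) -> float:
    """Same step with the assignment moved above the if-statement."""
    start_time = time.perf_counter()
    if sharrow_enabled:
        pass  # branch-specific setup would go here
    # ... the step's real work would happen here ...
    return time.perf_counter() - start_time


if __name__ == "__main__":
    try:
        run_step_buggy(sharrow_enabled=False)
    except UnboundLocalError as err:
        print(f"buggy version failed: {err}")
    print(f"fixed version elapsed: {run_step_fixed(sharrow_enabled=False):.6f}s")
```

The exact wording of the error message differs between Python versions, but the exception type is `UnboundLocalError` in both cases.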

As a workaround, I use an earlier version, ActivitySim == 1.1.3. That should resolve things for you. Separately, I started running into a variety of other numpy/pandas bugs when I tried using block group IDs as zone IDs, which require 64-bit integers (int64) rather than plain int32. So I started my own fork to keep track of bug fixes and added some minor performance enhancements along the way. Feel free to try it out. Full disclosure: I haven't tested that every example still works, though it should. https://github.com/nick-fournier-rsg/populationsim
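For context on why block group IDs need int64: a full Census block group GEOID has 12 digits, which is far beyond the largest value a signed 32-bit integer can hold. The quick check below is only an illustration; the ID values are made up, and `fits_int32` is a hypothetical helper, not part of PopulationSim or ActivitySim.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1  # 2,147,483,647


def fits_int32(zone_id: int) -> bool:
    """Return True if zone_id can be stored in a signed 32-bit integer."""
    return INT32_MIN <= zone_id <= INT32_MAX


# Hypothetical IDs for illustration only.
small_zone_id = 1042                  # typical sequential zone number
block_group_zone_id = 410030001001    # 12-digit block-group-style ID

if __name__ == "__main__":
    for zone_id in (small_zone_id, block_group_zone_id):
        dtype = "int32" if fits_int32(zone_id) else "int64"
        print(f"{zone_id} needs at least {dtype}")
```

Any code path that casts zone IDs to int32 will overflow or error on IDs like the second one, which is consistent with the numpy/pandas problems mentioned above.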

As for multiprocessing in general, it can be really unforgiving in PopulationSim, and I try to avoid it unless absolutely necessary. One issue I ran into is making sure that the proper tables are listed under "coalesce" so that they are recombined after being split. Otherwise, you wind up with a bunch of duplicate rows and NAs.

bettinardi, Oct 12 '23 19:10