Trace files from the same run should all have the same hash

Open dhensle opened this issue 1 year ago • 7 comments

Is your feature request related to a problem? Please describe.

Trace files currently written out have a unique hash applied to them. This allows multiple runs to be saved and compared. However, it can be hard to keep track of which trace file goes with which run. [image]

Describe the solution you'd like

All trace files from the same run should use the same hash.

dhensle avatar Oct 02 '24 22:10 dhensle

@dhensle There already exists a RunID hash that looks like it is supposed to be used to suffix trace files.

My guess is that, when the trace files are written, either the RunID is unavailable or the state.filesystem.get_trace_path method falls back to a default time-based hash for some other reason. But let me know if I am barking up the wrong tree here.

andkay avatar Sep 04 '25 13:09 andkay

@andkay From my read of the existing runID creation code, it looks like the hash is created using the current time. Does this get created only once at the start of the run? What happens in mp mode -- will each subprocess have a different time?

I don't think you are barking up the wrong tree, but I am not too familiar with the details of how the current tracing hash was originally implemented.

dhensle avatar Sep 04 '25 15:09 dhensle

It's definitely hashing the time, but my assumption would certainly be that something called a "RunID" should be created once and be stable across subprocesses. In any case, sounds like I am correct that this is the first port of call to investigate.

Either the RunID doesn't actually identify a run uniquely based on its launch time, or the tracing code doesn't pass this information through, in which case a different hash based on the current time is created whenever a trace file path is generated automatically.
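To make the concern about time-based hashing concrete, here is a minimal sketch. It does not use ActivitySim's actual RunID code; make_run_id is a hypothetical helper that hashes a timestamp passed in explicitly. It only illustrates that an ID derived from the launch time is stable if computed once and reused, and changes if it is recomputed later (for example, when a subprocess starts).

```python
import hashlib


def make_run_id(timestamp_ns: int) -> str:
    """Hypothetical helper: derive a short ID by hashing a timestamp."""
    digest = hashlib.sha256(str(timestamp_ns).encode("utf-8")).hexdigest()
    return digest[:8]


# An ID computed from the launch time and then reused is stable.
launch_ns = 1_700_000_000_000_000_000
run_id_main = make_run_id(launch_ns)
run_id_reused = make_run_id(launch_ns)

# An ID recomputed from a later clock reading (e.g. in a subprocess
# that starts slightly after the main process) will differ.
run_id_recomputed = make_run_id(launch_ns + 1)
```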

andkay avatar Sep 04 '25 16:09 andkay

I've been looking into this some over the past couple of days. I initially tried running the prototype_mtc example as is (tracing an 11-person household) and all of the trace files had the same ID. I turned on multiprocessing and set it to use 2 processes, and each process appeared to have its own ID. I looked into it a bit further, and the ID is effectively generated when the State object is initialized, which happens at the start of the run as well as when new processes are initialized. Right now I'm thinking the approach would be to just generate a run ID at the start of the run and then save it in memory with other settings, attributes, etc, and have the tracing functions access that (along with something indicating the process name).
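A rough sketch of that approach, using a toy RunContext class rather than ActivitySim's real State object: the ID is generated once, each worker gets a copy carrying the same ID plus its own process name, and the trace filename is built from both. The class, the method names, the worker names, and the filename layout are all assumptions made for illustration.

```python
import hashlib
import time
from dataclasses import dataclass


def new_run_id() -> str:
    """Hypothetical: hash the current time once, at the start of the run."""
    return hashlib.sha256(str(time.time_ns()).encode("utf-8")).hexdigest()[:8]


@dataclass(frozen=True)
class RunContext:
    """Toy stand-in for the settings a state would carry."""

    run_id: str
    process_name: str = "main"

    def for_subprocess(self, process_name: str) -> "RunContext":
        # Reuse the parent's run_id instead of generating a new one.
        return RunContext(run_id=self.run_id, process_name=process_name)

    def trace_filename(self, stem: str) -> str:
        # Assumed layout: <stem>.<process>.<run_id>.csv
        return f"{stem}.{self.process_name}.{self.run_id}.csv"


main = RunContext(run_id=new_run_id())
workers = [main.for_subprocess(f"worker_{i}") for i in range(2)]
names = [w.trace_filename("trips") for w in workers]
```

With this layout, files from different workers remain distinguishable by process name while still sharing one run-level hash.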

JoeJimFlood avatar Oct 21 '25 22:10 JoeJimFlood

@jpn-- Here's a bit of a summary of what I've tried and where I'm stuck:

I initially ran the prototype_mtc example with tracing on and saw that every trace file had the same hash. I then turned on multiprocessing and saw that there were different hashes, but everything from within the same subprocess had the same hash. This suggested to me that the Run IDs were being generated when each process was being spun up. I looked through the code and saw that the tracing object is created as an attribute of the state object when the latter is generated. I then looked through and found that a new state object is created for each new subprocess, confirming what I had suspected.

My idea for a solution was to move the generation of the Run ID to cli/run.py and have one Run ID generated that the main state and any state generated for a subprocess would use. The way I tried implementing this (visible in the fix-trace-id branch of the CS fork in the code) was to then create a run_id attribute as a None type when the state was first initialized, add it to the main state, and then register it as an injectable so that the subprocess states could use it. I saw that the tracing object has an _obj attribute that references the state to which the tracing instance belongs, so I edited the tracing functions to reflect that. I was able to confirm (by adding logging messages) that the main state and subprocess states indeed were being given the same run ID. However, all of the tracing files had different hashes. I added some more logging messages to see if the tracing objects were getting the global run ID and they all had values of None for the run ID, resulting in a random hash each time.

My implementation results in the following steps happening (which may be an oversimplification of what's actually happening):

  1. The state object is initialized.
  2. The state is given a tracing object which is then initialized.
  3. The state object is given a run ID.

It would appear as though the tracing object is not accounting for its parent state having an updated run ID. I have tried moving around the order in which various things happen, but that has always resulted in the hashes all still being different or the model crashing due to something else being done in the wrong order. I do see that there are the state accessor functions (that the Run ID was initially using) so maybe I need to try something with that, but I am completely stumped about what exactly is going on under the hood. I also noticed that the state object can be initialized with some context, but that's not being done in the subprocesses, so would something like that resolve the issue?
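The three-step ordering above can be reproduced in a toy model. This is only one possible explanation, not a confirmed diagnosis of ActivitySim's tracing code: ToyState, EagerTracing, and LazyTracing are hypothetical classes. The only detail borrowed from the real code is the _obj back-reference to the owning state. If the tracing object copies the run ID when it is constructed, it keeps None; if it looks the value up through _obj at the time of use, it sees the later assignment.

```python
class EagerTracing:
    """Copies run_id at construction, before the state has one."""

    def __init__(self, state):
        self._obj = state
        self.run_id = state.run_id  # snapshot taken in step 2: still None


class LazyTracing:
    """Reads run_id from the owning state each time it is needed."""

    def __init__(self, state):
        self._obj = state

    @property
    def run_id(self):
        return self._obj.run_id  # reflects whatever the state holds now


class ToyState:
    def __init__(self, tracing_cls):
        self.run_id = None                # step 1: state initialized
        self.tracing = tracing_cls(self)  # step 2: tracing initialized


eager_state = ToyState(EagerTracing)
eager_state.run_id = "abc123"             # step 3: run ID assigned

lazy_state = ToyState(LazyTracing)
lazy_state.run_id = "abc123"              # step 3: run ID assigned

eager_seen = eager_state.tracing.run_id
lazy_seen = lazy_state.tracing.run_id
```

In this model the eager variant never sees the assigned ID, which would match the None values observed in the logging messages, while the lazy variant does.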

JoeJimFlood avatar Oct 30 '25 20:10 JoeJimFlood

@JoeJimFlood I pushed a commit to your branch fix-trace-id that may fix it. The change should use the original system to create run_id values, and pass them to the subprocesses. Can you (a) test this, and (b) write a unit test that confirms it works correctly? Thanks

jpn-- avatar Nov 13 '25 17:11 jpn--

@jpn-- (a) The fix seems to have worked on my end. Thanks! (b) I am starting to think about the best way to go about doing that.
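One possible shape for such a test, sketched under a stated assumption: trace files are taken to be named <stem>.<hash>.csv, which may not match the actual naming scheme, and collect_trace_hashes is a hypothetical helper. A real test would run a small multiprocess example first and then point the helper at its trace output directory. Here, placeholder files are created just to exercise the check.

```python
import re
import tempfile
from pathlib import Path

# ASSUMPTION: trace files end in ".<hex hash>.csv"; adjust to the real scheme.
HASH_PATTERN = re.compile(r"\.(?P<hash>[0-9a-f]+)\.csv$")


def collect_trace_hashes(trace_dir: Path) -> set:
    """Return the distinct hash suffixes found under trace_dir."""
    hashes = set()
    for path in trace_dir.rglob("*.csv"):
        match = HASH_PATTERN.search(path.name)
        if match:
            hashes.add(match.group("hash"))
    return hashes


def test_all_trace_files_share_one_hash():
    with tempfile.TemporaryDirectory() as tmp:
        trace_dir = Path(tmp)
        # Placeholder files standing in for the output of a real run.
        for stem in ("households", "persons", "trips"):
            (trace_dir / f"{stem}.deadbeef.csv").touch()
        assert collect_trace_hashes(trace_dir) == {"deadbeef"}


test_all_trace_files_share_one_hash()
```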

JoeJimFlood avatar Nov 13 '25 18:11 JoeJimFlood