
Keep preprocessor variables, drop unneeded columns in chooser table

Open • jpn-- opened this issue 1 year ago • 1 comment

In components where the chooser table is copied and/or merged with other data tables (e.g. interaction-simulate and interaction-sample-simulate), we would like the option to copy ONLY the required columns/variables. Doing so should greatly reduce memory requirements when there are many unused columns.

There are two potential approaches to this:

  • Manual. Allow the user to manually specify columns to keep or columns to drop. This has already been implemented in a handful of particularly problematic components, but a more generic, widely applicable interface for this capability would be better.
  • Automatic. Scan the specification file and have the program decide in advance of copy/merge what columns will be needed.
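To make the two approaches concrete, here is a minimal sketch, not ActivitySim's actual implementation. It assumes spec expressions are plain Python-style strings (optionally prefixed with `@`) that refer to columns as bare names, as `df.name`, or as `df['name']`. The helper names `columns_referenced` and `columns_to_keep`, and the `always_keep` parameter standing in for the manual option, are hypothetical.

```python
import ast


def columns_referenced(expressions, available_columns):
    """Return the available column names that appear in any expression."""
    available = set(available_columns)
    needed = set()
    for expr in expressions:
        # Assumption: a leading "@" only marks the expression type and can be stripped.
        tree = ast.parse(expr.lstrip("@").strip(), mode="eval")
        for node in ast.walk(tree):
            if isinstance(node, ast.Name) and node.id in available:
                needed.add(node.id)  # bare name: income > 1
            elif isinstance(node, ast.Attribute) and node.attr in available:
                needed.add(node.attr)  # attribute access: df.age
            elif (
                isinstance(node, ast.Constant)
                and isinstance(node.value, str)
                and node.value in available
            ):
                # Any string literal matching a column name, e.g. df['hh_size'].
                # This may over-keep, which is the safe direction.
                needed.add(node.value)
    return needed


def columns_to_keep(available_columns, expressions, always_keep=()):
    """Combine the automatic scan with a manual keep-list, preserving column order."""
    keep = columns_referenced(expressions, available_columns) | set(always_keep)
    return [c for c in available_columns if c in keep]


# Hypothetical chooser columns and spec expressions, for illustration only.
chooser_columns = ["income", "age", "hh_size", "unused_a", "unused_b"]
spec_expressions = ["income > 1", "@df.age * 2", "@df['hh_size'].clip(0, 5)"]

kept = columns_to_keep(chooser_columns, spec_expressions)
kept_with_manual = columns_to_keep(
    chooser_columns, spec_expressions, always_keep=("unused_b",)
)
```

With a pandas chooser table, the resulting list could be used as `choosers[kept]` before the copy or merge, so that only the needed columns are duplicated.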

Some concerns and complications:

  • Tracing. It may be desirable for an analyst to have access to all variables in a trace file, not just the retained variables.
  • Non-static variable names. It might be difficult to extract all variable name references, especially if some variable names are constructed programmatically ("on the fly") inside the model spec file, instead of appearing as a literal string.
  • Estimation mode. It is desirable for an analyst to have access to all variables in estimation mode.
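One conservative way to address these concerns is to skip pruning whenever it might hide something the analyst needs. The sketch below is illustrative only: `has_dynamic_reference`, `can_prune`, and the `estimation_mode` / `tracing` flags are hypothetical names, not existing ActivitySim settings, and the check assumes Python 3.9+ `ast` node layout and Python-style expression strings.

```python
import ast


def has_dynamic_reference(expr):
    """Heuristically flag expressions whose column names may be built at run time."""
    # Assumption: a leading "@" only marks the expression type and can be stripped.
    tree = ast.parse(expr.lstrip("@").strip(), mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.JoinedStr):
            return True  # f-string, e.g. df[f'dist_{mode}']
        if isinstance(node, ast.Subscript) and not isinstance(node.slice, ast.Constant):
            return True  # computed key, e.g. df['dist_' + mode]
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in {"getattr", "eval"}
        ):
            return True  # name looked up indirectly
    return False


def can_prune(expressions, estimation_mode=False, tracing=False):
    """Prune only when no analyst-facing mode is active and all names are static."""
    if estimation_mode or tracing:
        return False
    return not any(has_dynamic_reference(e) for e in expressions)


# Hypothetical expressions, for illustration only.
static_spec = ["income > 1", "@df['age'] * 2"]
dynamic_spec = ["income > 1", "@df[f'dist_{mode}'] < 5"]

prune_static = can_prune(static_spec)
prune_dynamic = can_prune(dynamic_spec)
prune_in_estimation = can_prune(static_spec, estimation_mode=True)
```

Falling back to keeping every column gives up the memory savings in those cases, but it never drops a variable that an expression, a trace file, or an estimation data bundle might need.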

jpn-- • Feb 06 '24 19:02

Possible complications:

  • All the columns are needed in estimation mode.
  • What if variable names are created "on the fly"?
  • What about tracing?

jpn-- • Feb 06 '24 19:02