urbansim_templates
Fix bug with out_column inference in OLSRegressionStep
OLSRegressionStep uses the dependent variable from the model expression as the out_column (destination for predicted values) if none is specified. There's a bug where we're not stripping inline transformations, causing Pandas to crash when it looks for a column named something like 'np.log1p(rent_sqft)'.
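A minimal sketch of the failure mode, for illustration only. It does not use OLSRegressionStep itself: the DataFrame, the `sqft` column, and the naive "everything left of the `~`" inference are assumptions standing in for whatever the step actually does internally.

```python
import pandas as pd

# Toy estimation data (hypothetical columns)
df = pd.DataFrame({'rent_sqft': [1.0, 2.0, 3.0], 'sqft': [500, 750, 1000]})

model_expression = 'np.log1p(rent_sqft) ~ sqft'

# Naive inference (assumed): take the whole left-hand side as the column name,
# without stripping the inline transformation
naive_out_column = model_expression.split('~')[0].strip()

# Looking that name up in the table fails, because no such column exists
try:
    df[naive_out_column]
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

Here `naive_out_column` ends up as the literal string `'np.log1p(rent_sqft)'`, which is not a column in the table, so the lookup raises a `KeyError`.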
Possible solutions:
a. Fix the inference so that it gets the right column name
b. Don't save fitted values to Orca if there isn't an out_column set
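A rough sketch of what each option could look like. Both function names (`infer_out_column`, `save_fitted_values`) are hypothetical and not part of urbansim_templates, and a plain DataFrame stands in for the Orca table. The regex only handles simple nested wrappers like `np.log1p(x)`; anything more complex returns `None` rather than guessing.

```python
import re
import pandas as pd

# Matches a single function-call wrapper around the whole string, e.g. np.log1p(...)
_WRAPPER = re.compile(r'^[\w.]+\((.*)\)$')

def infer_out_column(model_expression):
    """Option (a), hypothetical helper: strip inline transformations from the
    left-hand side and return the bare column name, or None if ambiguous."""
    lhs = model_expression.split('~')[0].strip()
    while True:
        match = _WRAPPER.match(lhs)
        if match is None:
            break
        lhs = match.group(1).strip()
    # Only accept a plain identifier; compound expressions are not inferred
    return lhs if lhs.isidentifier() else None

def save_fitted_values(df, fitted, out_column=None):
    """Option (b), hypothetical helper: write fitted values only when an
    out_column was given explicitly; otherwise leave the data untouched."""
    if out_column is None:
        return df
    out = df.copy()
    out[out_column] = fitted
    return out

df = pd.DataFrame({'rent_sqft': [1.0, 2.0, 3.0]})
unchanged = save_fitted_values(df, [0.5, 0.6, 0.7])
saved = save_fitted_values(df, [0.5, 0.6, 0.7], out_column='rent_sqft_predicted')
```

With (b), the caller has to name the destination, so the estimation column can only be overwritten deliberately.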
One problem with the automatic inference is the risk that people will accidentally overwrite their estimation data, so I'm leaning toward (b). In production models, out_column is usually the same as the dependent variable, but for model development it will usually be different. Probably better to make it explicit.
Fixing this can be paired with saving predicted values in the model object for interactive use.