urbansim_templates

Fix bug with out_column inference in OLSRegressionStep

Open smmaurer opened this issue 6 years ago • 0 comments

OLSRegressionStep uses the dependent variable from the model expression as the out_column (destination for predicted values) if none is specified. There's a bug where we're not stripping inline transformations, causing Pandas to crash when it looks for a column named something like 'np.log1p(rent_sqft)'.
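As a rough illustration of what stripping inline transformations could look like, here is a minimal sketch using only the standard library. The helper name `infer_out_column` is hypothetical and is not part of urbansim_templates; it assumes the left-hand side of the model expression parses as a Python expression and refers to exactly one column.

```python
import ast


def infer_out_column(model_expression):
    """Return the bare column name on the left-hand side of a model expression.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    # Left-hand side of a formula such as 'np.log1p(rent_sqft) ~ sqft'
    lhs = model_expression.split('~')[0].strip()
    tree = ast.parse(lhs, mode='eval')

    # Collect names that belong to the function being called
    # (e.g. 'np' in 'np.log1p'), so they are not mistaken for columns.
    func_name_ids = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            for sub in ast.walk(node.func):
                if isinstance(sub, ast.Name):
                    func_name_ids.add(id(sub))

    # Whatever names remain are treated as candidate column names.
    columns = [
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and id(node) not in func_name_ids
    ]

    if len(columns) != 1:
        raise ValueError(
            "Cannot infer a single out_column from %r; set it explicitly" % lhs
        )
    return columns[0]
```

With this sketch, `infer_out_column('np.log1p(rent_sqft) ~ sqft')` yields `'rent_sqft'`, while an ambiguous left-hand side such as `np.log(a / b)` raises a `ValueError` rather than guessing.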

Possible solutions:

a. Fix the inference so that it gets the right column name
b. Don't save fitted values to Orca if there isn't an out_column set

One problem with the automatic inference is the risk that people will accidentally overwrite their estimation data, so I'm leaning toward (b). In production models, out_column is usually the same as the dependent variable, but for model development it will usually be different. Probably better to make it explicit.

Fixing this can be paired with saving predicted values in the model object for interactive use.
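A minimal sketch of how option (b) might combine with keeping predictions on the model object. Everything here is an assumption for illustration: `RegressionStepSketch`, `save_predictions`, and `predicted_values` are hypothetical names, and a plain dict of lists stands in for the Orca table.

```python
class RegressionStepSketch:
    """Hypothetical stand-in for illustration; not the real OLSRegressionStep."""

    def __init__(self, out_column=None):
        self.out_column = out_column
        self.predicted_values = None

    def save_predictions(self, table, values):
        """Keep predictions on the object; write to the table only if asked.

        `table` is a plain dict mapping column names to lists, used here
        as a stand-in for an Orca table. Returns True if the table was
        modified, False otherwise.
        """
        # Always available for interactive inspection
        self.predicted_values = list(values)

        # Option (b): no explicit out_column means nothing is written,
        # so estimation data cannot be overwritten by accident.
        if self.out_column is None:
            return False

        table[self.out_column] = self.predicted_values
        return True
```

During model development the step could be run without an `out_column` and the results read from `predicted_values`; a production configuration would set `out_column` explicitly, including to the dependent variable when that is intended.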

smmaurer, Jul 18 '18 00:07