Randy Olson
Filed as an enhancement request. To future readers: please comment here if you'd like to take this issue on.
I like this idea. Before we implement it I want to consider how this might affect other input values that aren't percentages. For example, what if the user passes a...
Looking forward to it! 👍
Can you please provide a minimal reproducible example of this error? Are you able to provide a copy of `test2.csv`?
Sounds promising. Please submit a PR with the new functionality along with unit tests to demonstrate how it works.
Indeed, which is why I'm trying to work out how to distinguish ordinal from continuous variables. I posted [this question](http://stackoverflow.com/questions/35826912/what-is-a-good-heuristic-to-detect-if-a-column-in-a-pandas-dataframe-is-categori) on Stack Overflow to brainstorm.
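To make the question concrete, here is a rough sketch of the kind of heuristic I have in mind. It is not what the library does: the function name `looks_categorical` and both thresholds are made up for illustration, and it works on a plain list of column values rather than a DataFrame.

```python
from numbers import Real


def looks_categorical(values, max_unique=10, max_unique_ratio=0.05):
    """Guess whether a column is categorical/ordinal rather than continuous.

    Hypothetical heuristic: the thresholds are illustrative, not tuned.
    """
    # Treat None and NaN as missing and ignore them.
    non_missing = [
        v for v in values
        if v is not None and not (isinstance(v, float) and v != v)
    ]
    if not non_missing:
        return False

    # Non-numeric values (strings, booleans, ...) are treated as categorical.
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in non_missing):
        return True

    # Any value with a fractional part suggests a continuous variable.
    if any(not float(v).is_integer() for v in non_missing):
        return False

    # Whole numbers: categorical if there are few distinct values,
    # either in absolute terms or relative to the column length.
    n_unique = len(set(non_missing))
    return n_unique <= max_unique or n_unique / len(non_missing) <= max_unique_ratio
```

A check like this will misclassify some columns (an integer ID column with few rows, for example), which is why a user override would probably still be needed.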
You should be able to use the sklearn `OneHotEncoder` to get the equivalent of the pandas `get_dummies()`.
See the docs you linked and the `categorical_features` parameter.
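As a minimal sketch of that equivalence, assuming pandas and scikit-learn are installed and using a made-up single-column DataFrame: the example passes only the categorical column to the encoder, so it does not rely on `categorical_features`, which, as far as I know, has been removed from recent scikit-learn releases in favor of `ColumnTransformer`.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy data, invented for this example.
df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# pandas: one indicator column per category, ordered alphabetically.
dummies = pd.get_dummies(df[["color"]]).to_numpy(dtype=float)

# scikit-learn: fit_transform returns a sparse matrix by default,
# so convert it to a dense array before comparing.
encoder = OneHotEncoder()
encoded = encoder.fit_transform(df[["color"]]).toarray()

# Both encodings should contain the same indicator values.
same = bool((dummies == encoded).all())
```

One practical difference is that the fitted `OneHotEncoder` remembers its categories, so it can be reused on new data with the same column layout.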
Running `autoclean` multiple times might be the easier solution. It might also be a useful extension to autocleaner to allow the user to pass multiple preprocessors in a list.
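A rough sketch of what passing a list could look like, under the assumption that each preprocessor is simply a callable that takes the data and returns the transformed data. `apply_preprocessors`, `strip_whitespace`, and `drop_empty` are hypothetical names for illustration, not part of the current API.

```python
def apply_preprocessors(data, preprocessors):
    """Apply each preprocessor in order, feeding each result to the next.

    Hypothetical helper: assumes every preprocessor is a callable that
    takes the data and returns a transformed copy.
    """
    result = data
    for preprocess in preprocessors:
        result = preprocess(result)
    return result


# Two toy preprocessors that operate on a list of strings.
def strip_whitespace(rows):
    return [row.strip() for row in rows]


def drop_empty(rows):
    return [row for row in rows if row]


cleaned = apply_preprocessors([" a ", "   ", "b"], [strip_whitespace, drop_empty])
```

Order matters here: running `drop_empty` before `strip_whitespace` would keep the whitespace-only row, which is the same behavior you would get from running the cleaning step multiple times by hand.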
@adrose, do you mean via model-based imputation?