Case Insensitivity on MLeap Models

Open Ben-Epstein opened this issue 5 years ago • 3 comments

By default, if you train a PySpark ML model on a dataframe with uppercase column names and then run inference with the same column names in lowercase, the prediction fails. Is there a parameter or some other way to make inference case-insensitive?

I see this check for a strict vs. relaxed select on the LeapFrame, which I assume is what I'm looking for. ~~How can I set that when serializing a PySpark Model to an MLeap Bundle?~~

Thanks!

Edit: I see that my second question was wrong. The option comes from the transform function and is not embedded in the Bundle itself. So when I call model.transform(frame), is there documentation on how to pass in the relaxedSelect option?

Edit: It seems I was wrong about what relaxedSelect does. It appears to just "not throw an error on columns that don't exist" rather than match column names case-insensitively. Is there a case insensitivity option?

Ben-Epstein, Sep 08 '20 20:09

Hey @Ben-Epstein, thanks for the question. There isn't a case insensitivity option available at the moment.

I am not entirely sure this should be added to MLeap itself. I am leaning towards this being something that is better handled in the service that calls MLeap for scoring.

ancasarb, Jan 27 '21 21:01
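
For readers who want to try the service-side approach suggested above, here is a minimal sketch, not an MLeap API. It assumes the calling service already knows the column names the model expects (for example, from the model's input schema) and receives each row as a plain dict. The helper names `align_columns` and `rename_row` are hypothetical.

```python
def align_columns(expected, incoming):
    """Map incoming column names to the model's expected names, ignoring case.

    `expected` and `incoming` are lists of column names. Both helpers are
    hypothetical and are not part of MLeap or PySpark.
    """
    by_lower = {}
    for name in expected:
        key = name.lower()
        if key in by_lower:
            # Two expected columns differ only by case, so matching is ambiguous.
            raise ValueError(
                f"expected columns {by_lower[key]!r} and {name!r} differ only by case"
            )
        by_lower[key] = name

    # Keep only incoming columns that match an expected column.
    mapping = {}
    for name in incoming:
        target = by_lower.get(name.lower())
        if target is not None:
            mapping[name] = target

    missing = sorted(set(expected) - set(mapping.values()))
    if missing:
        raise KeyError(f"no incoming column matches: {missing}")
    return mapping


def rename_row(row, mapping):
    """Return a copy of `row` with keys renamed; unmatched keys are left as-is."""
    return {mapping.get(key, key): value for key, value in row.items()}
```

The service would call `align_columns` once per request schema and apply `rename_row` to each row before building the frame passed to the model. Unmatched extra columns pass through unchanged, so whether they are tolerated still depends on how the frame is selected downstream.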

@ancasarb thanks for the reply. The issue I am running into is that I don't necessarily know whether the model was trained (and saved) with a dataframe that had uppercase, lowercase, or mixed-case column names. So at deployment, the case is sometimes wrong and the model fails. Spark actually has the same issue (although sklearn does not). It would be great if MLeap supported that flexibility.

Ben-Epstein, Jan 27 '21 23:01

Maybe you can add an option to "force lower" or "force upper"? During serialization you simply force all column names in the schema to lower/upper case and then do the same at transform time?

Ben-Epstein, Jan 28 '21 14:01
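
Until such an option exists, the "force lower" / "force upper" idea can be approximated outside MLeap by normalizing column names yourself, once before training and once before inference. The sketch below is only an illustration under that assumption; `force_case` is a hypothetical helper, not an MLeap or Spark function.

```python
def force_case(columns, mode="lower"):
    """Return `columns` with every name forced to lower or upper case.

    Hypothetical helper: apply it to the training dataframe's columns before
    fitting and to the inference input's columns before transform, so both
    sides agree regardless of the original casing.
    """
    if mode not in ("lower", "upper"):
        raise ValueError("mode must be 'lower' or 'upper'")
    convert = str.lower if mode == "lower" else str.upper
    normalized = [convert(name) for name in columns]
    # Refuse to merge columns that differ only by case.
    if len(set(normalized)) != len(normalized):
        raise ValueError("column names collide after case normalization")
    return normalized


# With a PySpark dataframe `df`, the renamed frame could be built with:
#   df = df.toDF(*force_case(df.columns))
```

Because the same function runs at training time and at inference time, the saved model only ever sees one casing, which avoids having to know how the original dataframe was cased.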