cobra
Warn user when trying to preprocess unsupported datatypes
I often get questions about the cryptic errors thrown when users call Cobra's preprocessor. The messages are too obscure for developers to realize that the datatypes they are passing are not supported. Most of the time this is just datetime and pandas' categorical datatypes. It would be best to raise a warning at the start of preprocessing if any variables are of these types, before jumping into the actual preprocessing. Another thing to look into is whether we can start supporting pandas' categorical datatype.
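A minimal sketch of what such an up-front check could look like. The function name `warn_on_unsupported_dtypes` and its return value are hypothetical and not part of Cobra's API; the sketch also assumes that datetime and pandas categorical columns are the only types worth flagging.

```python
import warnings

import pandas as pd


def warn_on_unsupported_dtypes(df, columns=None):
    """Hypothetical helper (not part of Cobra): warn about columns whose
    dtype is datetime or pandas categorical before preprocessing starts."""
    columns = list(df.columns) if columns is None else list(columns)
    flagged = {}
    for col in columns:
        dtype = df[col].dtype
        if isinstance(dtype, pd.CategoricalDtype):
            flagged[col] = "category"
        elif pd.api.types.is_datetime64_any_dtype(dtype):
            flagged[col] = "datetime"
    if flagged:
        details = ", ".join(f"{col} ({kind})" for col, kind in flagged.items())
        warnings.warn(
            "These columns have dtypes that may not be supported by the "
            f"preprocessor: {details}. Consider converting them first.",
            UserWarning,
        )
    # Return the flagged columns so callers can act on them programmatically.
    return flagged
```

Calling this on the input DataFrame before the actual preprocessing would surface the problem early, instead of letting a failure appear further down the pipeline.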
Could you explain in more detail when exactly this error occurs? I tried passing datetime and categorical datatypes to the PreProcessor as categorical variables, but I did not get an error message. However, I do get an error when preprocessing the datetime variable as a continuous variable, but that error is quite clear to me, so I do not see the need to change it: TypeError: must be real number, not Timestamp
Hey Patrick!
It mainly happens when people try to be smart and use pandas' "category" dtype (https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html#series-creation) instead of just a string-typed column. I remember Cobra crashing in that case.
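As a possible workaround on the user side, categorical columns can be cast back to plain object columns before preprocessing. This is only a sketch: `categoricals_to_objects` is a hypothetical helper, and whether Cobra handles the converted columns correctly is an assumption that would need to be checked.

```python
import pandas as pd


def categoricals_to_objects(df):
    """Hypothetical helper: return a copy of df in which every column with
    the pandas "category" dtype is cast to object dtype, so its values are
    plain Python strings again (missing values stay missing)."""
    out = df.copy()
    cat_cols = out.select_dtypes(include="category").columns
    for col in cat_cols:
        out[col] = out[col].astype(object)
    return out
```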
I never tried passing datetimes as a categorical variable; good if that works. We did get this question in the past, though, because people tried passing all variables as continuous variables. It may be that a clearer error has been added to Cobra in the meantime; the error used to be cryptic, with some failure thrown "deeper down".
(An additional thought, up for debate: we could still suggest that Cobra users consider creating derived features for datetimes (e.g. "months since..."), since that is often needed to build good models, rather than treating them as categorical variables. This does risk sounding pedantic to our users, though.)
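To make the "months since..." idea concrete, here is a minimal sketch of such a derived feature. The helper name `months_since` and the choice to count whole calendar months (ignoring the day of the month) are assumptions made for illustration, not something Cobra provides.

```python
import pandas as pd


def months_since(dates, reference):
    """Hypothetical helper: number of calendar months between each date in
    `dates` (a datetime Series) and `reference`, ignoring the day of month."""
    reference = pd.Timestamp(reference)
    year_diff = reference.year - dates.dt.year
    month_diff = reference.month - dates.dt.month
    return year_diff * 12 + month_diff
```

The resulting numeric column can then be passed to the preprocessor as a continuous variable instead of the raw datetime.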
I'm only writing this in case you're not taking holidays and don't have a project. Otherwise, enjoy the festivities and have a great holiday period!
Sander