sklearn-onnx
Support for TransformedTargetRegressor
Hi,
Is there any plan to introduce support for TransformedTargetRegressor?
For the record, my use case is the following:
TransformedTargetRegressor(estimator, func=np.log, inverse_func=np.exp)
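Concretely, a minimal version of this use case looks like the following (LinearRegression only stands in here for whatever estimator is actually used):

import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# targets are kept strictly positive so that np.log is defined
X = np.random.rand(50, 3)
y = np.exp(np.random.rand(50))
model = TransformedTargetRegressor(regressor=LinearRegression(), func=np.log, inverse_func=np.exp)
model.fit(X, y)
print(model.predict(X[:2]))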
Let me talk to the team about this and get back. Thanks! Also, is this a high priority item for you?
@prabhat00155 hello, great to know. In terms of priority: without it, I either can't use ONNX as a way to serialize my model, or I need to move the log/exp transformation into my own code.
I will describe the use case in more detail; it will make the importance of something like TransformedTargetRegressor for me clearer, and maybe help detect that I'm solving my needs with the wrong tools? :)
I'm using ONNX so that I can have some compatibility between different versions of my models. The objective is to be able to change the way the model is learned while keeping the same input/output interface for the services that load it and run predictions.
For example, right now I'm using TransformedTargetRegressor as a way to reduce the impact of outliers in my labels (I'm using the XGBoost regressor with the squared error objective). But tomorrow, when XGBoost 1.0.0 is released, maybe I will solve this same problem with the newly introduced squared log error objective, which can also be used to reduce the impact of outliers.
In terms of engineering, if I can avoid changing the way the models are used every time I change the way they are built, it's a good thing for me :)
Your model requires the inverse function to be converted into ONNX, and that's not always possible. There are multiple options to do that: have the runtime run some Python code, let the user specify the ONNX graph of the inverse function, or write a function which automatically converts the custom function into ONNX. In your case it would work because the function is simple. This is a direction I started to investigate (http://www.xavierdupre.fr/app/mlprodict/helpsphinx/mlprodict/onnx_grammar/onnx_translation.html#mlprodict.onnx_grammar.onnx_translation.translate_fct2onnx) but I'm not fully happy with the current state.
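For the second option, the ONNX graph of the inverse function can be built by hand with the onnx helper API. The sketch below only illustrates what such a graph would look like for exp; it is not a hook that exists in sklearn-onnx today:

from onnx import helper, TensorProto

# standalone graph computing y = exp(x), the inverse of log
node = helper.make_node("Exp", ["x"], ["y"])
graph = helper.make_graph(
    [node], "inverse_func",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [None, 1])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [None, 1])])
inverse_model = helper.make_model(graph)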
@xadupre good points, I hadn't thought about it enough to realise it wasn't so obvious :)
My objective would be to avoid the Python code solution, but I think that specifying the inverse function's ONNX graph during conversion would be acceptable, yes.
Of course the automatic conversion of most known functions is the best option, and as you say, it would make sense that all the numpy functions with a corresponding ONNX operator could be supported easily.
I created another approach: a numpy API for ONNX. If you write your function with this api, the conversion is free.
@xadupre this very nicely solves the problem, thanks a lot. I won't be able to test it right now, but I think we can consider this issue solved, and if there is a problem later I can always open a new ticket :)
@xadupre I tried your numpy API for ONNX and ran into this issue when using it together with the TransformedTargetRegressor as victornoel showed above.
MissingShapeCalculator: Unable to find a shape calculator for type '<class 'sklearn.compose._target.TransformedTargetRegressor'>'.
When using it with the FunctionTransformer as in your example, things work fine. So I was wondering whether you ever tested it with the TransformedTargetRegressor?
The following page lists the supported models: https://onnx.ai/sklearn-onnx/supported.html. There is no converter for TransformedTargetRegressor right now. So if you implemented a new converter, you need to register it: https://onnx.ai/sklearn-onnx/auto_tutorial/plot_kcustom_converter_wrapper.html (function update_registered_converter). If that's not the case, I'll need more information on the predictor inside the transformer to see if it is easy to do.
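As a rough sketch of what such a registration could look like (this hardcodes exp as the inverse function and uses OnnxSubEstimator to embed the fitted inner regressor; it is only an illustration under those assumptions, not a converter shipped with the library):

from skl2onnx import update_registered_converter
from skl2onnx.common.data_types import FloatTensorType
from skl2onnx.algebra.onnx_ops import OnnxExp
from skl2onnx.algebra.onnx_operator import OnnxSubEstimator
from sklearn.compose import TransformedTargetRegressor

def ttr_shape_calculator(operator):
    # one float column, same batch dimension as the input
    operator.outputs[0].type = FloatTensorType([operator.inputs[0].type.shape[0], 1])

def ttr_converter(scope, operator, container):
    opv = container.target_opset
    ttr = operator.raw_operator
    # convert the fitted inner regressor, then apply the (hardcoded) inverse function
    pred = OnnxSubEstimator(ttr.regressor_, operator.inputs[0], op_version=opv)
    final = OnnxExp(pred, op_version=opv, output_names=operator.outputs)
    final.add_to(scope, container)

update_registered_converter(
    TransformedTargetRegressor, "CustomTransformedTargetRegressor",
    ttr_shape_calculator, ttr_converter)

A real converter would of course have to translate the user-provided inverse_func instead of hardcoding exp.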
A new project, https://github.com/microsoft/onnx-script, has started to help create ONNX models. It might be easier to use. I did not write an example with sklearn-onnx yet, but one should come soon.
Thank you for the directions to look into. I did not implement a converter yet, only the function and inverse function as described in your example (http://www.xavierdupre.fr/app/mlprodict/helpsphinx/blog/2021/2021-05-05_numpyapionnx1.html). I probably misunderstood your sentence:
If you write your function with this api, the conversion is free.
My simple test employed sklearn's LinearRegression model, but actually I want to use it with a GradientBoostingRegressor and an ExtraTreesRegressor. As I understand from the page of supported models, all of these are supported.
I need to know more about the model you need to convert. Is your scenario still the same as the one mentioned at the beginning of the issue? What about the one with the linear regression?
Yes, I use a TransformedTargetRegressor with a log(x+1) transformation and its inverse:

# imports follow the mlprodict numpy-API example linked above
from typing import Any
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from mlprodict.npy import onnxnumpy_default, NDArray
import mlprodict.npy.numpy_onnx_impl as npnx

@onnxnumpy_default
def onnx_log_1(x: NDArray[Any, np.float32]) -> NDArray[(None, None), np.float32]:
    return npnx.log1p(x)

@onnxnumpy_default
def onnx_exp_1(x: NDArray[Any, np.float32]) -> NDArray[(None, None), np.float32]:
    return npnx.exp(x) - np.float32(1)

model = TransformedTargetRegressor(regressor=LinearRegression(), func=onnx_log_1, inverse_func=onnx_exp_1)
Other than this I basically stuck to your code here: http://www.xavierdupre.fr/app/mlprodict/helpsphinx/blog/2021/2021-05-05_numpyapionnx1.html
When it comes to calling to_onnx, it fails stating:
MissingShapeCalculator: Unable to find a shape calculator for type '<class 'sklearn.compose._target.TransformedTargetRegressor'>'.
I see. I'll complete the example to make it work.
I guess that was not that obvious, but if you install the latest version of mlprodict, you should be able to run the following example, which covers your need: https://github.com/sdpython/mlprodict/blob/master/_doc/examples/plot_converters.py
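For completeness, once a converter for TransformedTargetRegressor is registered (for instance along the lines of the earlier sketch in this thread), an end-to-end check could look roughly like this; the input name "X" is the default picked by to_onnx:

import numpy as np
from onnxruntime import InferenceSession
from skl2onnx import to_onnx
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# assumes a converter for TransformedTargetRegressor has already been registered
X = np.random.rand(50, 3).astype(np.float32)
y = np.exp(np.random.rand(50)).astype(np.float32)
model = TransformedTargetRegressor(regressor=LinearRegression(), func=np.log, inverse_func=np.exp)
model.fit(X, y)

onx = to_onnx(model, X)
sess = InferenceSession(onx.SerializeToString())
got = sess.run(None, {"X": X})[0]
print(np.abs(model.predict(X) - got.ravel()).max())  # should be small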