py-earth

Extrapolation Errors when predicting

Open Fish-Soup opened this issue 7 years ago • 2 comments

Hi, I am currently trying to forecast the power usage of a portfolio made up of 1000 sites. I have fitted 1000 MARS models to this data, regularized using Elastic Net. Normally this works well.

Sometimes I have missing data over parts of my phase space. This means that the MARS model sometimes produces ridiculous numbers over this range. MARS can obviously extrapolate with a polynomial, and the model will be poorly constrained in this region.

As I am aggregating the 1000 models, the chance of this happening in at least one of them isn't insignificant. Currently I look at the min and max y values and limit the output to some multiple of these. This stops a site producing ridiculous numbers, but smaller errors are hard for me to see.
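For reference, a minimal sketch of the kind of clipping described above. It is one possible reading of "some multiple" of the min and max: the training range of y is widened symmetrically about its midpoint. The function name clip_to_training_range and the multiple parameter are hypothetical, not part of py-earth.

```python
import numpy as np

def clip_to_training_range(y_pred, y_train, multiple=1.5):
    """Limit predictions to a widened band around the observed training targets.

    Assumption: the band is the training range [min, max] of y, stretched
    about its midpoint by `multiple` (multiple=1.0 clips to the raw range).
    """
    y_min, y_max = np.min(y_train), np.max(y_train)
    centre = 0.5 * (y_min + y_max)
    half_width = 0.5 * (y_max - y_min) * multiple
    return np.clip(y_pred, centre - half_width, centre + half_width)

# Example: training targets span 10..20, so multiple=2 gives bounds 5..25.
clipped = clip_to_training_range(
    np.array([-100.0, 12.0, 1000.0]), np.array([10.0, 20.0]), multiple=2.0
)
```

This only catches gross outliers. A prediction that is wrong but still inside the band passes through unchanged, which is the limitation mentioned above.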

Do we have any way of testing how volatile MARS is at a given point in phase space? I'm also not totally sure what I would do with the prediction if I found this to be true. Any ideas?

Many thanks

Simon

Fish-Soup, Jun 07 '18 07:06

@Fish-Soup If you want to quantify volatility, you can probably come up with something based on the predict_deriv method of the fitted Earth model. It returns the gradient of your model at whatever points you pass in. There's a usage example for predict_deriv here: https://contrib.scikit-learn.org/py-earth/auto_examples/plot_derivatives.html#sphx-glr-auto-examples-plot-derivatives-py
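One rough way to turn that into a check, sketched under assumptions: compute the gradient norm at each point and flag new points whose norm is much larger than anything seen on the training data. The helpers gradient_norm and flag_volatile, the factor threshold, and the _ProductOfHingesStandIn class are all hypothetical. The stand-in only mimics predict_deriv so the snippet runs without py-earth installed; with a real model you would pass a fitted Earth instance instead. The reshape is there because the exact output shape of predict_deriv (with or without a trailing output axis) is assumed rather than checked here.

```python
import numpy as np

def gradient_norm(model, X):
    """Euclidean norm of the model gradient at each row of X."""
    X = np.asarray(X, dtype=float)
    grads = np.asarray(model.predict_deriv(X))
    # Flatten any trailing axes so each row holds all partial derivatives.
    return np.linalg.norm(grads.reshape(len(X), -1), axis=1)

def flag_volatile(model, X_new, X_train, factor=3.0):
    """Flag rows of X_new whose gradient norm exceeds `factor` times the
    largest gradient norm observed on the training inputs."""
    threshold = factor * gradient_norm(model, X_train).max()
    return gradient_norm(model, X_new) > threshold

class _ProductOfHingesStandIn:
    """Hypothetical stand-in for a fitted model with an interaction term:
    y = max(0, x0) * max(0, x1). Only predict_deriv is implemented."""

    def predict_deriv(self, X):
        h0 = np.maximum(0.0, X[:, 0])
        h1 = np.maximum(0.0, X[:, 1])
        d0 = h1 * (X[:, 0] > 0)  # dy/dx0
        d1 = h0 * (X[:, 1] > 0)  # dy/dx1
        return np.column_stack([d0, d1])

model = _ProductOfHingesStandIn()
X_train = np.array([[0.2, 0.4], [1.0, 1.0], [0.5, 0.1]])
X_new = np.array([[0.5, 0.5], [10.0, 10.0]])  # second row is far outside training
flags = flag_volatile(model, X_new, X_train)
```

Note that a purely additive MARS model (no interaction terms) is piecewise linear, so its gradient stays constant beyond the outermost knot and this check would not fire there. It is most informative when interaction terms are present.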

jcrudy, Jun 07 '18 16:06

Cheers thanks for that.

Fish-Soup, Aug 15 '18 18:08