
Regression in PlotToolPro

Open zoeyzhougs opened this issue 4 years ago • 7 comments

Describe the problem. Regression should be available in PlotToolPro.

Describe the solution you'd like Add a class in gs-quant (constructor in PlotToolPro) for the model fitting which has methods to get model parameters and perform prediction.

For example, linear regression should take the form LinearRegression(X, y, w), where X denotes the explanatory variables [x1, x2, x3, ...], y is the dependent variable, and w is the moving window. The class has methods:

coefficient(i) - the i-th coefficient
rsquare() - model R squared
tstat(i) - t-stat of the i-th coefficient
fitted_values() - fitted values
predict(X') - predicted values, where X' is the test set of explanatory variables

(Note: if multiple regression models have been fitted over the moving window, the most recent model will be used for prediction.)
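To make the proposed interface concrete, here is a minimal pure-Python sketch of a rolling ordinary least squares fit. It is not the gs-quant implementation: the class name LinearRegressionSketch, the helper _solve, the choice to always fit an intercept (so that coefficient(0) is the intercept), and treating w = 0 as "use the full sample" are all assumptions made for illustration. It uses the r_squared spelling settled on later in this thread and leaves out the t-stat method to stay short.

```python
from typing import List, Sequence


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


class LinearRegressionSketch:
    """Hypothetical OLS with intercept, refitted on every window of w points."""

    def __init__(self, X: Sequence[Sequence[float]], y: Sequence[float], w: int = 0):
        n = len(y)
        w = w or n  # assumption: w == 0 means a single fit on the full sample
        if not 1 <= w <= n:
            raise ValueError("window must be between 1 and len(y)")
        # Each design row is [1, x1_t, x2_t, ...]; the leading 1 is the intercept.
        self._rows = [[1.0] + [x[t] for x in X] for t in range(n)]
        self._y = list(y)
        self._w = w
        # One coefficient vector per window; window k covers points k .. k+w-1.
        self._betas = [self._fit(k, k + w) for k in range(n - w + 1)]

    def _fit(self, start: int, stop: int) -> List[float]:
        rows, ys = self._rows[start:stop], self._y[start:stop]
        k = len(rows[0])
        # Normal equations: (X'X) beta = X'y
        xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
        xty = [sum(r[i] * v for r, v in zip(rows, ys)) for i in range(k)]
        return _solve(xtx, xty)

    @staticmethod
    def _dot(beta: List[float], row: List[float]) -> float:
        return sum(b * v for b, v in zip(beta, row))

    def coefficient(self, i: int) -> List[float]:
        """i-th coefficient for each window; i = 0 is the intercept."""
        return [beta[i] for beta in self._betas]

    def fitted_values(self) -> List[float]:
        """Fitted value at the last observation of each window."""
        last = self._w - 1
        return [self._dot(b, self._rows[k + last]) for k, b in enumerate(self._betas)]

    def r_squared(self) -> List[float]:
        """R squared of each window's fit (nan if y is constant in the window)."""
        out = []
        for k, beta in enumerate(self._betas):
            rows = self._rows[k:k + self._w]
            ys = self._y[k:k + self._w]
            mean = sum(ys) / len(ys)
            sst = sum((v - mean) ** 2 for v in ys)
            ssr = sum((v - self._dot(beta, r)) ** 2 for v, r in zip(ys, rows))
            out.append(1.0 - ssr / sst if sst > 0 else float("nan"))
        return out

    def predict(self, X_new: Sequence[Sequence[float]]) -> List[float]:
        """Predict with the most recent window's model, as the note above says."""
        beta = self._betas[-1]
        rows = [[1.0] + [x[t] for x in X_new] for t in range(len(X_new[0]))]
        return [self._dot(beta, r) for r in rows]
```

With one regressor, LinearRegressionSketch([x], y, 60).coefficient(1) would give the rolling slope, one value per 60-observation window. Returning one value per window rather than a dated series is a simplification; a real implementation would keep the time index.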

In PlotToolPro, users could fit a linear regression between changes of implied volatility of S&P 500 index and returns on the underlying by:

spx = returns(SPX.spot(), 1, simple)
spx_vol = diff(SPX.implied_volatility(1m, forward, 100), 1)
r = LinearRegression([spx], spx_vol, 60)
r.fitted_values()

Are you willing to contribute Yes

zoeyzhougs avatar Feb 24 '20 06:02 zoeyzhougs

Looks good to me. Should we use r_square and t_stat, to be consistent with other methods?

What does an example with predict look like?

francisg77 avatar Mar 04 '20 13:03 francisg77

Agreed on the naming. For now it's hard to use the predict function, as we don't have functions to easily split data into train and test subsets. One example, which needs improvement:

spx = returns(SPX.spot(), simple)
spx_vol = diff(SPX.implied_volatility(1m, forward, 100), 1)
spx_train = interpolate(spx, [2020-02-01, 2020-02-02, 2020-02-03])
spx_vol_train = interpolate(spx_vol, [2020-02-01, 2020-02-02, 2020-02-03])
spx_test = interpolate(spx, [2020-02-04, 2020-02-05])
r = LinearRegression([spx_train], spx_vol_train)
r.predict(spx_test)

zoeyzhougs avatar Mar 05 '20 11:03 zoeyzhougs

Renaming r_square to r_squared.

zoeyzhougs avatar Mar 13 '20 03:03 zoeyzhougs

I'm a student and have been working in ML for a while now, and I'm a new contributor here. We could use the train_test_split function from the sklearn library to split the data into training and test sets easily, which would help with the predict function. I would have done it myself, but I can't find PlotToolPro.
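One caveat for time series: train_test_split shuffles rows by default, so for dated data it would need shuffle=False, or a split on a cutoff date. A minimal sketch of the date-based version, assuming a series is held as a plain dict from date to value; the helper name split_by_date and the numbers are made up for illustration and are not part of gs-quant or sklearn:

```python
from datetime import date
from typing import Dict, Tuple

Series = Dict[date, float]


def split_by_date(series: Series, cutoff: date) -> Tuple[Series, Series]:
    """Observations strictly before cutoff go to train, the rest to test."""
    ordered = sorted(series.items())  # keep chronological order
    train = {d: v for d, v in ordered if d < cutoff}
    test = {d: v for d, v in ordered if d >= cutoff}
    return train, test


# Placeholder values on the dates used earlier in this thread.
spx = {
    date(2020, 2, 1): 0.010,
    date(2020, 2, 2): -0.004,
    date(2020, 2, 3): 0.007,
    date(2020, 2, 4): -0.012,
    date(2020, 2, 5): 0.003,
}
spx_train, spx_test = split_by_date(spx, date(2020, 2, 4))
```

Splitting on a date keeps every test observation after every training observation, so a model fitted on spx_train never sees the period it is asked to predict.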

Atharva-Peshkar avatar May 10 '20 15:05 Atharva-Peshkar

@Atharva-Peshkar thanks for the input. We can run an example in pure Python; you don't need access to the PlotTool Pro application, as the functions being discussed are exposed in the timeseries module in GS Quant. Give us a shout if you are keen to extend this, and we can give some guidance and add a tutorial.

andrewphillipsn avatar Jun 09 '20 02:06 andrewphillipsn

@andyphillipsgs Yeah, sure! I'd be glad to contribute.

Atharva-Peshkar avatar Jun 09 '20 10:06 Atharva-Peshkar

It might be an idea to open source plot tools.

sirinath avatar Dec 14 '21 04:12 sirinath