fingertipsR
predict_indicator()
A function that predicts the next year's value of an indicator. predict_indicator():
- Takes an IndicatorID as input
- Asks for the area type for the prediction (eg, UTLA), then:
- Extracts all indicator data for indicators in same profile(s) at the same geography
- Identifies latest year for target indicator
- Subsets dataframe of latest year information for all indicators
- Creates flat, wide table of remaining indicators with variables for each previous year of data available for each indicator (eg, indicator_x_1yr_previous, indicator_x_2yr_previous, … , indicator_x_nyr_previous)
- Trains and tests model on second latest year for target indicator (maybe multiple machine learning methods)
- Uses best model to predict next year of data for indicator
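The wide-table step above could be sketched roughly as follows in base R. This is only an illustration, not the package's implementation: `make_lag_table()` is a hypothetical helper, the column names `AreaCode`, `IndicatorID`, `Year` and `Value` are assumed, and the numbers are invented toy values rather than real Fingertips data.

```r
# Toy long-format data; the values are made up purely for illustration.
long <- data.frame(
  AreaCode    = rep(c("A1", "A2", "A3"), each = 4),
  IndicatorID = "x",
  Year        = rep(2014:2017, times = 3),
  Value       = c(10, 11, 12, 13, 20, 19, 18, 17, 5, 6, 8, 9)
)

# Hypothetical helper: one row per area, one column per indicator and lag,
# where the lag is counted back from the latest year in the data.
make_lag_table <- function(df, latest = max(df$Year)) {
  prev <- df[df$Year < latest, ]
  prev$lag <- latest - prev$Year
  prev$name <- paste0("indicator_", prev$IndicatorID, "_", prev$lag, "yr_previous")
  wide <- reshape(
    prev[, c("AreaCode", "name", "Value")],
    idvar = "AreaCode", timevar = "name", direction = "wide"
  )
  # reshape() prefixes the new columns with "Value."; strip that prefix.
  names(wide) <- sub("^Value\\.", "", names(wide))
  rownames(wide) <- NULL
  wide
}

wide <- make_lag_table(long)
print(wide)
```

With more than one indicator in `long`, the same call would give one set of `_nyr_previous` columns per indicator, which is the flat table the model would be trained on.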
Candidate methods:
- Lasso
- GLM
- SVM
- Random forest
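A minimal sketch of the "train several methods, keep the best" idea, using only base R so it runs without extra packages. The data are simulated, the column names `lag1`, `lag2` and `target` are assumptions, and the three candidates (an intercept-only baseline, `lm` and `glm`) are stand-ins: fitters such as `glmnet::cv.glmnet`, `e1071::svm` or `randomForest::randomForest` could be slotted into the same list, though their predict calls may need adapting.

```r
set.seed(1)

# Simulated wide table: two lagged predictors and the target year's value.
n <- 60
dat <- data.frame(
  lag1 = rnorm(n, mean = 50, sd = 10),
  lag2 = rnorm(n, mean = 50, sd = 10)
)
dat$target <- 0.8 * dat$lag1 + 0.1 * dat$lag2 + rnorm(n, sd = 2)

# Hold out a quarter of the areas for testing.
test_idx <- sample(n, 15)
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# Each candidate is a function that takes training data and returns a fitted
# model with a predict() method.
candidates <- list(
  baseline = function(d) lm(target ~ 1, data = d),
  lm_lag1  = function(d) lm(target ~ lag1, data = d),
  glm_all  = function(d) glm(target ~ lag1 + lag2, data = d, family = gaussian())
)

rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# Fit every candidate on the training rows and score it on the held-out rows.
scores <- vapply(candidates, function(fit_fun) {
  fit <- fit_fun(train)
  rmse(test$target, predict(fit, newdata = test))
}, numeric(1))

# Keep the candidate with the lowest held-out error and refit it on all rows.
best_name <- names(which.min(scores))
best_fit  <- candidates[[best_name]](dat)
print(scores)
```

Keeping the candidates in a named list means adding or dropping a method is a one-line change, which may matter if the number of package dependencies has to stay small.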
https://github.com/julianflowers/Data-science/blob/master/scripts/get_sui_data.R
https://github.com/julianflowers/Data-science/blob/master/suicide_prediction2.Rmd
This is not so much forecasting as prediction (subtle, I know): prediction seems to be about fitting values to unseen data, whereas forecasting is about the future. To forecast next year we would need to be able to estimate all the model inputs as well...
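A toy illustration of that distinction, with invented numbers and a hypothetical predictor column `other_indicator`. A model fitted on this year's data can fill in an unseen area whose inputs are known, but for next year the inputs themselves are missing:

```r
# Invented values for one year: a predictor indicator and the target indicator.
this_year <- data.frame(
  other_indicator = c(1, 2, 3, 4, 5),
  target          = c(2.1, 3.9, 6.2, 8.1, 9.8)
)
fit <- lm(target ~ other_indicator, data = this_year)

# Prediction: an unseen area in the same year, where the input is observed.
unseen_area <- data.frame(other_indicator = 3.5)
pred_unseen <- predict(fit, newdata = unseen_area)

# Forecast: next year's input has not been published, so there is nothing
# to feed the model and predict() returns NA.
next_year <- data.frame(other_indicator = NA_real_)
forecast_raw <- predict(fit, newdata = next_year)

print(pred_unseen)
print(forecast_raw)
```

Using only lagged values as inputs sidesteps part of this, since last year's values are already known when forecasting the next one.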
Have been trying a few other models - xgboost, gbm, brnn...
xgboost seems to be very popular, if a bit fiddly. brnn is a Bayesian neural network, which seems quite accurate.
This looks really good. I'm starting to think this belongs to a different package. This package has been reviewed by some rOpenSci reviewers and one of the comments is to reduce dependencies on other packages. That is a good suggestion and helps draw the boundaries around the limits of this package. I think we need to start developing the insights package internally...