Improve computation of final betas

Open PhilippKaniuth opened this issue 1 year ago • 2 comments

Possibly change how the optionally returned final betas are computed to make them more useful for downstream analyses. To do so, change frrsa/frrsa/fitting/fitting/final_model so that it runs a repeated CV to find the best hyperparameter for the whole dataset. It is unclear how to deal with several hyperparameters, i.e. whether averaging them is actually sensible. The best course of action might also depend on the kind of hyperparameter, i.e. whether one uses fractional ridge regression or the more classical approach as implemented in sklearn (which depends on how one sets the parameter nonnegative).
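A minimal sketch of what "repeated CV on the whole dataset" could look like, under strong simplifying assumptions: a single predictor, no intercept, the classical ridge penalty alpha (not the fraction used in fractional ridge regression), and squared error as the score. The names ridge_beta, repeated_cv_alpha and final_beta are hypothetical and are not part of frrsa. Instead of averaging per-fold winners, the sketch accumulates the held-out error per candidate over all folds and repeats and picks the candidate with the lowest total.

```python
import random


def ridge_beta(x, y, alpha):
    """Closed-form ridge coefficient for one predictor without intercept."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + alpha)


def repeated_cv_alpha(x, y, alphas, n_splits=5, n_repeats=10, seed=0):
    """Return the candidate alpha with the lowest total held-out squared error."""
    rng = random.Random(seed)
    n = len(x)
    total_error = {alpha: 0.0 for alpha in alphas}
    for _ in range(n_repeats):
        # Reshuffle before every repeat so each repeat uses a different partition.
        indices = list(range(n))
        rng.shuffle(indices)
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            x_train = [x[i] for i in train_idx]
            y_train = [y[i] for i in train_idx]
            for alpha in alphas:
                beta = ridge_beta(x_train, y_train, alpha)
                total_error[alpha] += sum(
                    (y[i] - beta * x[i]) ** 2 for i in test_idx
                )
    return min(alphas, key=lambda a: total_error[a])


def final_beta(x, y, alphas, **cv_kwargs):
    """Select alpha by repeated CV, then refit once on the whole dataset."""
    best_alpha = repeated_cv_alpha(x, y, alphas, **cv_kwargs)
    return best_alpha, ridge_beta(x, y, best_alpha)
```

Selecting on the aggregated score sidesteps the question of how to average hyperparameters, at the cost of restricting the result to the candidate grid. With several targets, the same selection would presumably be run once per target.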

PRs for this issue welcome.

PhilippKaniuth avatar Mar 20 '23 14:03 PhilippKaniuth

For context: currently, the optionally returned betas are computed using the whole dataset. To do so, however, one needs the best hyperparameter for the whole dataset (for each target separately, if there is more than one). At the moment, this best hyperparameter is only computed ad hoc by averaging the hyperparameters from all outer cross-validations. Note that the hyperparameter from an outer cross-validation is based only on a sub-subset of the whole data (as it is estimated in the inner cross-validation), so it is not guaranteed to be the actual best hyperparameter for the whole dataset.
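To illustrate why the kind of averaging matters, here is a small sketch with made-up numbers. It assumes that "averaging" means a plain arithmetic mean and that the per-fold hyperparameters are classical ridge alphas spanning several orders of magnitude. The values in fold_alphas are hypothetical, not frrsa output.

```python
import math


def arithmetic_mean(values):
    """Plain average on the original scale."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Average on the log scale; requires strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical best alphas from three outer cross-validation folds.
fold_alphas = [0.01, 1.0, 100.0]

print(arithmetic_mean(fold_alphas))  # dominated by the largest alpha
print(geometric_mean(fold_alphas))   # close to the middle of the log grid
```

Here the arithmetic mean is roughly 33.67, whereas the mean on the log scale is roughly 1, so the two conventions would lead to quite different final betas.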

PhilippKaniuth avatar Mar 27 '23 12:03 PhilippKaniuth

~If the aim is to generalize betas to a second target, there might be a (more feasible) alternative:~

~The user submits two (or several) targets. One of them is used to fit the statistical model. The others are simply correlated with the reweighted predicting matrix, essentially generalizing the model. This should likely be done in the function fit_and_score.~

Moved to its own issue, #46.

PhilippKaniuth avatar Mar 29 '23 14:03 PhilippKaniuth