gregor-robinson issues

Results: 6 issues of gregor-robinson

Disambiguate feature importance from get_importance_scores

Despite having a similar name, get_importance_scores is intended to capture the impact of training data, which is related but not identical to the gain scores we use to communicate feature...

Fail on invalid subsetStrategy

Currently `io.citrine.lolo.learners.RandomForest` (and `io.citrine.lolo.learners.ExtraRandomTrees`, which emulates the RF interface) defaults to automatic subset strategy selection when the parameter `subsetStrategy` is an invalid string. This is an opportunity for an unobservable...
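A minimal sketch of what failing fast could look like. The object name `SubsetStrategy`, the method `numFeatures`, and the accepted strings below are hypothetical and chosen only for illustration; they are not taken from lolo's actual API. The point is that an unrecognized value raises an error instead of silently falling through to a default.

```scala
// Hypothetical sketch: names and accepted values are assumptions, not lolo's API.
object SubsetStrategy {
  /** Resolve the number of features to consider per split, or throw on unrecognized input. */
  def numFeatures(subsetStrategy: String, totalFeatures: Int): Int = {
    require(totalFeatures > 0, "totalFeatures must be positive")
    subsetStrategy.toLowerCase match {
      case "all"      => totalFeatures
      case "sqrt"     => math.max(1, math.ceil(math.sqrt(totalFeatures.toDouble)).toInt)
      case "onethird" => math.max(1, math.ceil(totalFeatures / 3.0).toInt)
      case _ =>
        // Fail loudly rather than quietly selecting an automatic strategy.
        throw new IllegalArgumentException(
          s"Unrecognized subsetStrategy: '$subsetStrategy'"
        )
    }
  }
}
```

With this shape, a typo such as `"sqt"` surfaces as an `IllegalArgumentException` at construction time rather than as a model trained with an unintended setting.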

Unbias standard deviation estimator

getStdDevMean currently uses ~a biased variance estimator~ the square root of the sample variance. The estimator should be made unbiased by replacing the denominator with ~`treePredictions.length - 1`~ `treePredictions.length - 1.5` or...
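The two denominators can be compared in a short sketch. The object and method names here are hypothetical and do not reflect how `getStdDevMean` is implemented. The `n - 1.5` denominator is a rule-of-thumb correction that is only approximately unbiased, and only under the assumption of normally distributed values.

```scala
// Hypothetical sketch: not lolo's implementation of getStdDevMean.
object StdDevEstimators {
  /** Sum of squared deviations from the sample mean. */
  private def sumSquaredDeviations(xs: Seq[Double]): Double = {
    val mean = xs.sum / xs.length
    xs.map(x => (x - mean) * (x - mean)).sum
  }

  /** Square root of the sample variance (denominator n - 1). */
  def sqrtSampleVariance(xs: Seq[Double]): Double = {
    require(xs.length >= 2, "need at least two values")
    math.sqrt(sumSquaredDeviations(xs) / (xs.length - 1))
  }

  /** Rule-of-thumb correction (denominator n - 1.5); approximately unbiased for normal data. */
  def approxUnbiasedStdDev(xs: Seq[Double]): Double = {
    require(xs.length >= 2, "need at least two values")
    math.sqrt(sumSquaredDeviations(xs) / (xs.length - 1.5))
  }
}
```

Because the corrected denominator is smaller, the corrected estimate is always slightly larger, and the gap matters most when the number of values is small.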

Make linear regression numerically stable.

`LinearRegressionLearner` solves for linear coefficients using a pseudoinverse, which is numerically unstable. It should be trivial to replace this with a LAPACK `dgels` or `dgelsd`.

Investigate using multinomial bootstrap

Bagger uses a Poisson bootstrap. This converges in probability to the ordinary multinomial bootstrap in the large data limit, but we should confirm it's a suitable approximation for our small...
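To make the difference concrete, here is a sketch of the two resampling schemes expressed as per-row counts. The object and method names are hypothetical and are not lolo's `Bagger` code. The Poisson draw uses Knuth's multiplication method with rate 1, which is an assumption about the rate rather than something stated in the issue.

```scala
import scala.util.Random

// Hypothetical sketch: not lolo's Bagger implementation.
object BootstrapWeights {
  /** Multinomial bootstrap: draw n row indices with replacement; counts always sum to n. */
  def multinomial(n: Int, rng: Random): Array[Int] = {
    val counts = new Array[Int](n)
    for (_ <- 0 until n) counts(rng.nextInt(n)) += 1
    counts
  }

  /** One draw from Poisson(1) using Knuth's multiplication method. */
  private def poissonOne(rng: Random): Int = {
    val limit = math.exp(-1.0)
    var k = 0
    var p = rng.nextDouble()
    while (p > limit) {
      k += 1
      p *= rng.nextDouble()
    }
    k
  }

  /** Poisson bootstrap: an independent Poisson(1) count per row; the total equals n only on average. */
  def poisson(n: Int, rng: Random): Array[Int] = Array.fill(n)(poissonOne(rng))
}
```

The multinomial counts are constrained to sum to exactly `n`, while the Poisson counts are independent and their total fluctuates around `n`. That fluctuation is relatively larger for small `n`, which is the regime the issue asks to check.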

Improve multi-task test coverage.

See MultiBaggerTest for an example of how multi-task learning is not as thoroughly exercised as its single-task counterparts.