gregor-robinson
Despite having a similar name, get_importance_scores is intended to capture the impact of training data, which is related but not identical to the gain scores we use to communicate feature...
Currently `io.citrine.lolo.learners.RandomForest` (and `io.citrine.lolo.learners.ExtraRandomTrees`, which emulates the RF interface) defaults to automatic subset strategy selection when the parameter `subsetStrategy` is an invalid string. This is an opportunity for an unobservable...
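One way to make this failure observable is to reject unrecognized strategy strings rather than falling back. The following is a minimal Python sketch of that idea, not lolo's implementation: the strategy names, the `resolve_subset_size` helper, and the choice of what "auto" resolves to are all assumptions made for illustration.

```python
import math

# Hypothetical strategy names, chosen for illustration only; they are not
# taken from lolo's source.
_STRATEGIES = {
    "all": lambda n: n,
    "sqrt": lambda n: max(1, round(math.sqrt(n))),
    "log2": lambda n: max(1, round(math.log2(n))),
    "onethird": lambda n: max(1, round(n / 3)),
}


def resolve_subset_size(strategy: str, num_features: int) -> int:
    """Return the number of features to consider at each split.

    Assumes num_features >= 1. "auto" must be requested explicitly; any
    unrecognized string raises instead of silently selecting automatically.
    """
    key = strategy.strip().lower()
    if key == "auto":
        key = "sqrt"  # assumed automatic choice, for illustration only
    if key not in _STRATEGIES:
        valid = ", ".join(sorted(["auto", *_STRATEGIES]))
        raise ValueError(
            f"Unknown subsetStrategy {strategy!r}; expected one of: {valid}"
        )
    return _STRATEGIES[key](num_features)
```

With this shape, a typo such as `"sqr"` fails immediately with a message listing the accepted values, instead of training with a strategy the caller did not ask for.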
`getStdDevMean` currently uses ~a biased variance estimator~ the square root of the sample variance, which is a biased estimator of the standard deviation. This should be made unbiased by replacing the denominator with ~`treePredictions.length - 1`~ `treePredictions.length - 1.5` or...
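To make the two denominators concrete, here is a small Python sketch that computes the standard deviation with an adjustable correction term. The function name and the sample values are hypothetical. The `n - 1.5` denominator is a rule-of-thumb correction that is only approximately unbiased, and that approximation assumes roughly normally distributed values.

```python
import math


def std_with_denominator(values, correction):
    """Square root of sum((x - mean)^2) / (n - correction)."""
    n = len(values)
    if n - correction <= 0:
        raise ValueError("need more values than the correction term")
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_sq / (n - correction))


tree_predictions = [1.0, 2.0, 3.0, 4.0]  # made-up per-tree predictions
print(std_with_denominator(tree_predictions, 1.0))  # denominator n - 1
print(std_with_denominator(tree_predictions, 1.5))  # denominator n - 1.5
```

The smaller denominator always yields a larger estimate, and the gap matters most when the number of trees is small.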
`LinearRegressionLearner` solves for the linear coefficients using a pseudoinverse, which is numerically unstable. It should be trivial to replace this with LAPACK `dgels` or `dgelsd`.
Bagger uses a Poisson bootstrap. This converges in probability to the ordinary multinomial bootstrap in the large data limit, but we should confirm it's a suitable approximation for our small...
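The difference between the two schemes on small data can be sketched in a few lines of Python. This is an illustration under stated assumptions, not lolo's code: it assumes each row's count is drawn independently from Poisson(1), and all function names are hypothetical. One difference is easy to see: the multinomial bootstrap always draws exactly `n` rows, while the Poisson resample size is random and can even be zero.

```python
import math
import random


def poisson_draw(rng, lam=1.0):
    """Sample a Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def poisson_bootstrap_counts(n, rng):
    """Each row independently appears Poisson(1) times; the total is random."""
    return [poisson_draw(rng) for _ in range(n)]


def multinomial_bootstrap_counts(n, rng):
    """Draw exactly n rows with replacement; the total is always n."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts


def prob_empty_poisson_resample(n):
    """Probability that every row gets a zero count: exp(-1) ** n."""
    return math.exp(-n)


rng = random.Random(0)
print(sum(poisson_bootstrap_counts(5, rng)))      # varies from draw to draw
print(sum(multinomial_bootstrap_counts(5, rng)))  # always 5
print(prob_empty_poisson_resample(5))             # chance of an empty resample
```

Comparing the distribution of resample sizes, or of a downstream statistic, across many draws at the training-set sizes of interest would be one way to check whether the approximation is adequate.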
See `MultiBaggerTest` for an example of how multi-task learning is not as thoroughly exercised as its single-task counterparts.