lolo issues

Make linear regression numerically stable.

`LinearRegressionLearner` solves for the linear coefficients using a pseudoinverse, which is numerically unstable. It should be trivial to replace this with a call to LAPACK `dgels` or `dgelsd`.
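As a rough illustration of the alternative, LAPACK `dgels` solves full-rank least-squares problems through a QR factorization rather than by forming an explicit inverse. The pure-Python sketch below does the same with Householder reflections; the function name `lstsq_qr` and the toy data are hypothetical, and this is not lolo's code or a substitute for calling LAPACK.

```python
import math

def lstsq_qr(A, b):
    """Least-squares solve of A x ~ b via Householder QR (full column rank assumed)."""
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]  # work on copies so the inputs are untouched
    y = b[:]
    for k in range(n):
        # Build the Householder vector that zeroes column k below the diagonal.
        norm = math.sqrt(sum(R[i][k] ** 2 for i in range(k, m)))
        if norm == 0.0:
            raise ValueError("matrix is rank deficient")
        alpha = -math.copysign(norm, R[k][k])
        v = [R[i][k] for i in range(k, m)]
        v[0] -= alpha
        v_norm_sq = sum(x * x for x in v)
        # Apply the reflection I - 2 v v^T / (v^T v) to the remaining columns.
        for j in range(k, n):
            dot = sum(v[i - k] * R[i][j] for i in range(k, m))
            factor = 2.0 * dot / v_norm_sq
            for i in range(k, m):
                R[i][j] -= factor * v[i - k]
        # Apply the same reflection to the right-hand side.
        dot = sum(v[i - k] * y[i] for i in range(k, m))
        factor = 2.0 * dot / v_norm_sq
        for i in range(k, m):
            y[i] -= factor * v[i - k]
    # Back substitution on the upper-triangular n x n block of R.
    x = [0.0] * n
    for k in reversed(range(n)):
        residual = y[k] - sum(R[k][j] * x[j] for j in range(k + 1, n))
        x[k] = residual / R[k][k]
    return x

# Toy data: points on the line y = 1 + 2 x, with an intercept column.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 7.0]
coefficients = lstsq_qr(A, b)
print(coefficients)
```

For rank-deficient or nearly collinear designs, an SVD-based routine such as `dgelsd` would be the more appropriate choice; the sketch above simply raises an error in the exactly degenerate case.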

gregor-robinson

Expose predictions made by individual trees in ensemble

It's helpful for various reasons to have access to the individual predictions made by each tree in the ensemble, in addition to the usual average over ensemble, uncertainties, etc. that...

jkoeller

Investigate using multinomial bootstrap

Bagger uses a Poisson bootstrap. This converges in probability to the ordinary multinomial bootstrap in the large data limit, but we should confirm it's a suitable approximation for our small...
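To make the comparison concrete, here is a minimal sketch of the two resampling schemes using only the standard library. It assumes the Poisson bootstrap draws an independent Poisson(1) weight per training row (the rate is an assumption, not taken from the Bagger source), and all function names are hypothetical.

```python
import math
import random

def poisson_draw(rng, lam=1.0):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def poisson_bootstrap_weights(n, rng):
    """Independent Poisson(1) count for each of the n training rows."""
    return [poisson_draw(rng) for _ in range(n)]

def multinomial_bootstrap_weights(n, rng):
    """Counts from drawing n rows uniformly with replacement."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts

rng = random.Random(0)
n = 8
poisson_weights = poisson_bootstrap_weights(n, rng)
multinomial_weights = multinomial_bootstrap_weights(n, rng)
# The multinomial weights always sum to n; the Poisson weights only do so on average.
print(sum(poisson_weights), sum(multinomial_weights))
```

The difference in total sample size is largest, in relative terms, when `n` is small, which is the regime the issue asks about.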

gregor-robinson

Improve multi-task test coverage.

See MultiBaggerTest for an example of how multi-task learning is not as thoroughly exercised as its single-task counterparts.

gregor-robinson

Add `minDistinctLabels` to decision tree to prevent UQ collapse in Bagger

1

If the training labels have repeats of label values, then it is increasingly possible that every tree in the ensemble makes the same prediction (even if the input values are...

maxhutch

enhancement

Categorical input support for lolopy

5

I might be mistaken, but lolopy does not seem to support categorical inputs. Input of categorical features fails in utils.py with an attempted cast of X to np.float64. @WardLT If...

sesevgen

Merit class UncertaintyCorrelation yields nan for constant predictive uncertainties

When calling `UncertaintyCorrelation` with predictive distributions that have constant uncertainty, the value `varSigma` ([line 131 in `Merit.scala`](https://github.com/CitrineInformatics/lolo/blob/45bc1cc0d64d8c6a726005fb8e660ee7ebd1b582/src/main/scala/io/citrine/lolo/validation/Merit.scala#L131)) is zero, so the denominator is zero. As the numerator is also zero, not-a-number...
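The failure mode can be sketched with a Pearson-style correlation, assuming that is roughly what the merit computes (the exact formula in `Merit.scala` is not reproduced here). The function `safe_correlation` and its `fallback` parameter are hypothetical; which value to return for the degenerate case is a design decision for the maintainers.

```python
import math

def safe_correlation(xs, ys, fallback=0.0):
    """Pearson correlation that returns `fallback` when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denominator = math.sqrt(var_x * var_y)
    if denominator == 0.0:
        # Zero variance makes the ratio 0/0; report the fallback instead.
        return fallback
    return covariance / denominator

constant_sigma = [0.5, 0.5, 0.5]
errors = [0.1, 0.2, 0.3]
degenerate = safe_correlation(constant_sigma, errors)
regular = safe_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(degenerate, regular)
```

On the JVM the unguarded 0.0 / 0.0 evaluates to NaN silently, whereas Python raises `ZeroDivisionError`, so the explicit check is needed in either language to get a defined result.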

mrupp-citrine

Better Error Messages for Lolopy

1

The error message for lolopy when Java isn't installed is: `ValueError: invalid literal for int() with base 10: b''` We should make a better error message for this issue, and...
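The quoted message is what `int()` produces when handed empty bytes, which suggests that empty output from a Java probe is being parsed directly. A minimal sketch of a friendlier check follows; `parse_java_major_version` is a hypothetical helper, not lolopy's actual code, and the assumption that the probe output is a bare integer is for illustration only.

```python
def parse_java_major_version(raw):
    """Turn the raw bytes from a Java version probe into an int, with clear errors."""
    text = raw.decode("utf-8", errors="replace").strip()
    if not text:
        # Empty output is what triggers the confusing int() error in the issue.
        raise RuntimeError(
            "Could not determine the Java version: the probe returned no output. "
            "Is Java installed and on your PATH?"
        )
    try:
        return int(text)
    except ValueError as exc:
        raise RuntimeError(f"Unexpected Java version output: {text!r}") from exc

print(parse_java_major_version(b"11\n"))

try:
    parse_java_major_version(b"")
except RuntimeError as error:
    print(error)
```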

WardLT

Turn off Parallelism on Demand

There are cases where I want to train a bagged model in serial. A constructor argument for the bagger class that turns off parallelism would be nice.

WardLT

Re-weight Gini impurity in multitask based on number of classes

The maximum value of the Gini impurity is `(n-1)/n`, where `n` is the number of classes. This could cause multitask models to be biased towards modeling multi-class labels more accurately...
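The bound is easy to check numerically. The sketch below computes the Gini impurity for uniformly distributed labels and shows one possible re-weighting, dividing by `(n-1)/n` so that every task's impurity tops out at 1; that normalization is an assumption made for illustration, not necessarily the scheme the issue goes on to propose, and both function names are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class frequencies."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def normalized_gini_impurity(labels, num_classes):
    """Gini impurity divided by its maximum (n-1)/n for n classes."""
    if num_classes < 2:
        return 0.0  # a single-class task has no impurity to normalize
    return gini_impurity(labels) / ((num_classes - 1) / num_classes)

two_class = ["a", "b"]
three_class = ["a", "b", "c"]
# Uniform labels reach the maximum impurity for each class count.
print(gini_impurity(two_class), gini_impurity(three_class))
print(normalized_gini_impurity(two_class, 2), normalized_gini_impurity(three_class, 3))
```

Without normalization the three-class task can contribute a larger impurity than the two-class task, which is the source of the bias the issue describes.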

maxhutch
