lolo issues

Use of lolopy RF regressor in Anaconda

Hi all, I'm interested in applying this RF regressor. I have successfully used the RF regressor from scikit-learn with my training and testing data sets. I have tried to...

rortague

How to save the fitted RandomForestRegressor model?

2

How can I save a lolopy model? I tried to train a model like this:

```
from lolopy.learners import RandomForestRegressor
model = RandomForestRegressor()
model.fit(X, Y)
```

After that, I attempted...

heaynking

Training a RandomForestRegressor with boolean feature(s) results in model with no signal

On training a lolopy `RandomForestRegressor` learner using features that include a feature of type `numpy.bool_`, the resulting model has no signal. Removing the boolean feature or converting it into `numpy.int_`...
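The workaround mentioned in this report, converting the boolean feature to an integer type, can be sketched as a small preprocessing step. This is a minimal sketch using NumPy only; the helper name `as_numeric_features` and the example columns are hypothetical, and the snippet does not call lolopy itself.

```python
import numpy as np


def as_numeric_features(columns):
    """Stack 1-D feature columns into a float matrix.

    Boolean columns are cast to integers first (False -> 0, True -> 1),
    so no column of dtype numpy.bool_ reaches the learner.
    """
    converted = []
    for col in columns:
        col = np.asarray(col)
        if col.dtype == np.bool_:
            col = col.astype(np.int_)
        converted.append(col)
    return np.column_stack(converted).astype(float)


# Hypothetical example data: one continuous and one boolean feature.
density = np.array([7.9, 2.3, 8.9])
is_metal = np.array([True, False, True])

X = as_numeric_features([density, is_metal])
# X could then be passed to a learner, e.g. model.fit(X, y).
```

Casting before stacking keeps the conversion explicit, which makes it easy to see which columns were originally boolean.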

hegdevinayi

bug

Splittable random numbers for reproducible training

3

[Bagger](https://github.com/CitrineInformatics/lolo/blob/main/src/main/scala/io/citrine/lolo/bags/Bagger.scala#L101-L110) and [MultiTaskBagger](https://github.com/CitrineInformatics/lolo/blob/main/src/main/scala/io/citrine/lolo/bags/MultiTaskBagger.scala#L51-L56) both train the individual models in parallel. Because the order of training is uncontrolled, this means that Lolo random forests are inherently non-reproducible, even if the bagging...
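The general idea behind this request can be illustrated outside of lolo: if every model derives its own random stream from a root seed and its index, the result no longer depends on the order in which the parallel tasks run. The sketch below is written in Python rather than Scala and uses hash-derived seeds as a stand-in for a true splittable generator; all function names are hypothetical and this is not lolo's implementation.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor


def child_seed(root_seed, index):
    """Derive a per-model seed from the root seed and the model index."""
    digest = hashlib.sha256(f"{root_seed}:{index}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def bootstrap_indices(root_seed, model_index, n_rows):
    """Draw one bootstrap sample using a generator owned by this model only."""
    rng = random.Random(child_seed(root_seed, model_index))
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def draw_all_bootstraps(root_seed, n_models, n_rows, workers=4):
    """Draw all bootstrap samples in parallel.

    No generator is shared between tasks, so scheduling order cannot
    change which numbers a given model receives.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(
                lambda i: bootstrap_indices(root_seed, i, n_rows),
                range(n_models),
            )
        )


parallel = draw_all_bootstraps(42, n_models=8, n_rows=20, workers=4)
serial = draw_all_bootstraps(42, n_models=8, n_rows=20, workers=1)
```

With this scheme `parallel` and `serial` are identical, whereas a single generator shared by all workers would hand out numbers in whatever order the tasks happened to run.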

bfolie

Not able to reproduce results.

1

In the latest lolopy version (1.2.0), I fixed `random_seed`, but the results are still not reproducible (I have already fixed the NumPy random seed as well). Can you please fix it or tell me...

prateek-malhotra

Posterior model of random forest

2

Are there facilities for sampling from the posterior distribution of the random forest? (e.g. for integration with [Ax](https://ax.dev/docs/bayesopt.html)/[BoTorch](https://botorch.org/)).

sgbaird

Python/Scala bindings via py4j unstable

Use of multiple JVMs via py4j seems to crash on Linux. See also https://github.com/CitrineInformatics/smlb/issues/70

mrupp-citrine

bug

Disambiguate feature importance from get_importance_scores

Despite having a similar name, get_importance_scores is intended to capture the impact of training data, which is related but not identical to the gain scores we use to communicate feature...

gregor-robinson

Fail on invalid subsetStrategy

1

Currently `io.citrine.lolo.learners.RandomForest` (and `io.citrine.lolo.learners.ExtraRandomTrees`, which emulates the RF interface) defaults to automatic subset strategy selection when the parameter `subsetStrategy` is an invalid string. This is an opportunity for an unobservable...
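The fail-fast behavior requested here can be sketched as follows. The sketch is in Python rather than Scala, and the strategy names (`all`, `sqrt`, `log2`) and the function `resolve_num_subset` are hypothetical placeholders chosen for illustration; they are not taken from lolo's source.

```python
import math

# Hypothetical strategy names, for illustration only.
_STRATEGIES = {
    "all": lambda n: n,
    "sqrt": lambda n: max(1, round(math.sqrt(n))),
    "log2": lambda n: max(1, round(math.log2(n))),
}


def resolve_num_subset(subset_strategy, n_features):
    """Return the number of features to consider at each split.

    Unknown strategy strings raise an error instead of silently
    falling back to a default.
    """
    if isinstance(subset_strategy, str):
        try:
            rule = _STRATEGIES[subset_strategy]
        except KeyError:
            valid = ", ".join(sorted(_STRATEGIES))
            raise ValueError(
                f"Unknown subsetStrategy {subset_strategy!r}; "
                f"expected one of: {valid}"
            ) from None
        return rule(n_features)
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(subset_strategy, int) and not isinstance(subset_strategy, bool):
        if 1 <= subset_strategy <= n_features:
            return subset_strategy
        raise ValueError(
            f"subsetStrategy must be between 1 and {n_features}, "
            f"got {subset_strategy}"
        )
    raise TypeError(f"Unsupported subsetStrategy type: {type(subset_strategy).__name__}")
```

Raising on an unrecognized string turns a typo such as `"sqroot"` into an immediate, visible error rather than a model trained with an unintended setting.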

gregor-robinson

Unbias standard deviation estimator

2

getStdDevMean currently uses ~a biased variance estimator~ the square root of the sample variance. This should be unbiased by replacing the denominator with ~`treePredictions.length - 1`~ `treePredictions.length - 1.5` or...
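The effect of the proposed denominator can be shown with a small numeric sketch. The square root of the sample variance (denominator `n - 1`) still underestimates the standard deviation on average; `n - 1.5` is a common approximate correction, which assumes roughly normally distributed samples. The function below is a hypothetical Python illustration, not the Scala `getStdDevMean` code.

```python
import math


def std_dev(tree_predictions, ddof=1.5):
    """Standard deviation of per-tree predictions.

    ddof is subtracted from the sample count in the denominator:
    0 gives the population formula, 1 the square root of the sample
    variance, and 1.5 the approximate correction discussed above.
    """
    n = len(tree_predictions)
    if n - ddof <= 0:
        raise ValueError("need more predictions than ddof")
    mean = sum(tree_predictions) / n
    sum_sq = sum((p - mean) ** 2 for p in tree_predictions)
    return math.sqrt(sum_sq / (n - ddof))


# Hypothetical predictions from four trees.
predictions = [1.0, 2.0, 3.0, 4.0]
corrected = std_dev(predictions)            # denominator n - 1.5
sample_based = std_dev(predictions, ddof=1)  # denominator n - 1
```

The correction matters most for small ensembles; as the number of trees grows, the two denominators give nearly the same value.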

gregor-robinson
