Rik Huijzer issues

Results: 90 issues of Rik Huijzer

Regression likely contains a bug

There seems to be something wrong with the regression case: performance decreases as `max_rules` increases, whereas it should improve. It could be that...

enhancement

Improve the advanced example in the documentation

> I think some content of "Advanced example" could go to "Implementation overview". Also "Benchmarks" could perhaps go on its own section so that "advanced example" could focus on how...

`@load StableRulesClassifier` is still broken

```
pkg> activate --temp

pkg> add MLJModels
   Resolving package versions...
    Updating `/private/var/folders/nf/xynnllys6nl13xhj67sqgt7c0000gn/T/jl_AG8cXc/Project.toml`
  [d491faf4] + MLJModels v0.16.12
  [...]

pkg> add SIRUS
   Resolving package versions...
    Updating `/private/var/folders/nf/xynnllys6nl13xhj67sqgt7c0000gn/T/jl_AG8cXc/Project.toml`
  [cdeec39e] + SIRUS v1.3.3
  [...]
...
```

Remove rules for which both sides are approximately zero

Classes might become inconsistent between folds

Classes are determined for each fold separately. This could go wrong if some class is not contained in one of the folds. To fix this, add a check to test...
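A minimal sketch of such a check, written in Python for illustration rather than in the package's own Julia. The function name `missing_classes_per_fold` and the representation of folds as lists of row indices are assumptions, not part of the issue.

```python
def missing_classes_per_fold(y, folds):
    """Report which classes are absent from which folds.

    y:     sequence of class labels for the full dataset.
    folds: list of index lists, one per fold (assumed layout).
    Returns a dict mapping fold index to the sorted list of missing classes;
    the dict is empty when every fold contains every class.
    """
    all_classes = set(y)
    missing = {}
    for i, indices in enumerate(folds):
        present = {y[j] for j in indices}
        absent = all_classes - present
        if absent:
            missing[i] = sorted(absent)
    return missing
```

Determining the class set once from the full target and passing it to every fold would avoid the inconsistency altogether; the check above only detects it.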

Multiclass-classification target cannot be `String`s

Having to pass numbers only is a bit of a hassle. It would be nice to be able to pass `String`s.
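One way this could work is to map string labels to integer codes before fitting and back again when predicting. The sketch below is in Python for illustration only; `encode_labels` and `decode_labels` are hypothetical helpers, not functions of the package.

```python
def encode_labels(labels):
    """Map arbitrary (e.g. string) labels to integer codes.

    Returns (codes, classes), where classes[code] recovers the original label.
    Classes are sorted so that the mapping is deterministic.
    """
    classes = sorted(set(labels))
    index = {label: code for code, label in enumerate(classes)}
    codes = [index[label] for label in labels]
    return codes, classes


def decode_labels(codes, classes):
    """Translate integer codes back to the original labels."""
    return [classes[code] for code in codes]
```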

enhancement

Switch output printing to `PrettyTables.jl`

A table will be more clearly readable than the current output.

Make #features per tree a hyperparameter

For regression, default to `n/3` and also allow overriding the setting. See https://datascience.stackexchange.com/a/23677 for details.
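A minimal sketch of such a hyperparameter, in Python for illustration. The name `features_per_tree` and the `override` argument are hypothetical. The `n/3` default for regression comes from the issue; the square-root default for classification is an assumption based on common random forest practice.

```python
import math


def features_per_tree(n_features, task, override=None):
    """Number of features to sample per tree.

    override: explicit user setting; takes precedence over the defaults.
    task:     "regression" or "classification".
    """
    if override is not None:
        if not 1 <= override <= n_features:
            raise ValueError("override must be between 1 and n_features")
        return override
    if task == "regression":
        # Default from the issue: n/3, with at least one feature.
        return max(1, n_features // 3)
    # Assumption (not stated in the issue): floor(sqrt(n)) for classification.
    return max(1, math.isqrt(n_features))
```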

Does a non-greedy approach improve accuracy?

Non-greedy binary splitting shouldn’t be too expensive. The number of points to check is `q * sqrt(p)` for greedy splitting versus `(q * sqrt(p))^2` for non-greedy splitting. Maybe only enable it for reasonably low `p` and `q`.
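To get a feel for the difference, the two counts can be computed directly. This Python sketch assumes that `p` is the number of features and `q` the number of candidate cutpoints per feature, which the issue does not spell out; `candidate_counts` is a hypothetical helper.

```python
import math


def candidate_counts(p, q):
    """Return (greedy, non_greedy) numbers of points to check.

    greedy:     q * sqrt(p)
    non_greedy: (q * sqrt(p))^2
    """
    per_split = q * math.sqrt(p)
    return per_split, per_split ** 2
```

For example, `candidate_counts(16, 10)` gives 40 points for the first formula and 1600 for the second, which illustrates why restricting the option to low `p` and `q` may be sensible.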

Remove rules for which outcomes don't differ

Say there is a rule `A < 10, then a else b` where `a ≅ b`. Then the rule can be removed, since both branches give approximately the same outcome.
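A minimal sketch of this filter, in Python for illustration. Representing a rule as a `(condition, then_value, else_value)` tuple, the name `drop_uninformative_rules`, and the default tolerances are all assumptions.

```python
import math


def drop_uninformative_rules(rules, rel_tol=1e-9, abs_tol=1e-6):
    """Keep only rules whose two outcomes differ by more than the tolerance.

    rules: list of (condition, then_value, else_value) tuples (assumed layout).
    """
    return [
        rule
        for rule in rules
        if not math.isclose(rule[1], rule[2], rel_tol=rel_tol, abs_tol=abs_tol)
    ]
```

The absolute tolerance matters when both outcomes are close to zero, where a purely relative comparison would never consider them equal.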