smartcore
smartcore copied to clipboard
refactor: do not require an empty struct in cross_validate to define …
…which is the estimator
Fixes https://github.com/smartcorelib/smartcore/issues/230
Checklist
- [ ] My branch is up-to-date with development branch.
- [ ] Everything works and tested on latest stable Rust.
- [ ] Coverage and Linting have been applied
Current behaviour
New expected behaviour
Change logs
Even though, it looks like more easy to use this passing the parameter, it looks weird to me to pass an unitialized empty struct, instead we can take advantage of that we can explicitly define which is the Estimator that we want to use using turbofish syntax
@morenol from a programming perspective I agree, but writing ::<_, _, _, _, _, LogisticRegression<_, _, _, _>, _, _>
is really not viable for an interface that has to face an end-user. Think if you had to write that to issue a request using Python requests
: requests.request::<_,_,_,_, POST,_,_,_>(...)
?!!? would you ever use something like that?
If you want to remove the new
there are these options imo:
- passing the single
fit
function usingFn
orFnOnce
but it is hard as every estimator has different signatures - passing the string name
"LogisticRegression"
and instantiate an object insidefit
using an enumerator (as proposed for Kernel in #221) or an enumerator key likeEstimatorEnum::LogisticRegression
- refactoring
cross_validate
in a way that the user has to write onlycross_validate::<LogisticRegression>(...)
- anything else I don't know about
I used new
because there are some other constraints in place and the alternatives are not straightforward. As we have as a main point to keep an easy "pythonic" API we cannot allow Rust's types to reach the end-user's level.
PS: I don't like how cross_validate
and cross_validate_predict
are implemented, so maybe we should rewrite them in a better way. This could also lead to a better definition of the SupervisedEstimator
and Predictor
traits. For an example see fit and predict.
Codecov Report
Merging #231 (e95798d) into development (aab3817) will increase coverage by
0.11%
. The diff coverage isn/a
.
@@ Coverage Diff @@
## development #231 +/- ##
===============================================
+ Coverage 43.93% 44.05% +0.11%
===============================================
Files 85 85
Lines 7290 7268 -22
===============================================
- Hits 3203 3202 -1
+ Misses 4087 4066 -21
Impacted Files | Coverage Δ | |
---|---|---|
src/ensemble/random_forest_classifier.rs | 53.54% <ø> (+0.34%) |
:arrow_up: |
src/ensemble/random_forest_regressor.rs | 57.37% <ø> (+0.46%) |
:arrow_up: |
src/linear/elastic_net.rs | 50.00% <ø> (+0.39%) |
:arrow_up: |
src/linear/lasso.rs | 50.00% <ø> (+0.45%) |
:arrow_up: |
src/linear/linear_regression.rs | 35.41% <ø> (+1.41%) |
:arrow_up: |
src/linear/logistic_regression.rs | 38.14% <ø> (+0.19%) |
:arrow_up: |
src/linear/ridge_regression.rs | 40.17% <ø> (+0.34%) |
:arrow_up: |
src/model_selection/mod.rs | 72.22% <ø> (ø) |
|
src/naive_bayes/bernoulli.rs | 41.56% <ø> (+0.84%) |
:arrow_up: |
src/naive_bayes/categorical.rs | 34.50% <ø> (-1.16%) |
:arrow_down: |
... and 14 more |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
@morenol from a programming perspective I agree, but writing
::<_, _, _, _, _, LogisticRegression<_, _, _, _>, _, _>
is really not viable for an interface that has to face an end-user. Think if you had to write that to issue a request using Pythonrequests
:requests.request::<_,_,_,_, POST,_,_,_>(...)
?!!? would you ever use something like that?If you want to remove the
new
there are these options imo:
- passing the single
fit
function usingFn
orFnOnce
but it is hard as every estimator has different signatures- passing the string name
"LogisticRegression"
and instantiate an object insidefit
using an enumerator (as proposed for Kernel in Deserialization for Kernel #221) or an enumerator key likeEstimatorEnum::LogisticRegression
- refactoring
cross_validate
in a way that the user has to write onlycross_validate::<LogisticRegression>(...)
- anything else I don't know about
I used
new
because there are some other constraints in place and the alternatives are not straightforward. As we have as a main point to keep an easy "pythonic" API we cannot allow Rust's types to reach the end-user's level.PS: I don't like how
cross_validate
andcross_validate_predict
are implemented, so maybe we should rewrite them in a better way. This could also lead to a better definition of theSupervisedEstimator
andPredictor
traits. For an example see fit and predict.
I agree, I am not fan of having an API were something like this is required ::<_, _, _, _, _, LogisticRegression<_, _, _, _>, _, _>
. I will keep this open until I find a better way
Even though, it looks like more easy to use this passing the parameter, it looks weird to me to pass an unitialized empty struct, instead we can take advantage of that we can explicitly define which is the Estimator that we want to use using turbofish syntax
Even though, it looks like more easy to use this passing the parameter, it looks weird to me to pass an unitialized empty struct, instead we can take advantage of that we can explicitly define which is the Estimator that we want to use using turbofish syntax
see discussion at https://github.com/smartcorelib/smartcore/issues/230