mlr3pipelines
Dataflow Programming for Machine Learning in R
* Add another function, e.g. `$operate()`, which uses the train_df() functions internally and accepts data.frame? (Should this set a state? Maybe `$operate(df, train = TRUE)`?) * Just have `$train()` and...
It is not possible to build a stacked learner using mlr3 pipelines where the feature union uses a different task type than the final prediction. The following example is following...
There should be another option besides `no_collapse_above_prevalence` which uses absolute count.
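To illustrate what a count-based threshold could look like, here is a minimal base-R sketch. The helper `collapse_rare_levels()` and its `min_count` argument are hypothetical names chosen for illustration; they are not part of mlr3pipelines, and the sketch does not reproduce the collapsing logic of the existing PipeOp.

```r
# Hypothetical helper: merge all factor levels that occur fewer than
# `min_count` times into a single level named `other`.
collapse_rare_levels <- function(f, min_count, other = "other") {
  counts <- table(f)
  rare <- names(counts)[counts < min_count]
  # Assigning the same name to several levels merges them.
  levels(f)[levels(f) %in% rare] <- other
  f
}

f <- factor(c("a", "a", "a", "b", "b", "c"))
collapsed <- collapse_rare_levels(f, min_count = 2)
table(collapsed)
```

Unlike a prevalence threshold, the cutoff here does not change with the number of rows in the task.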
- don't assert on function with only one argument, we can't set `trafo = log` currently. - ~~give the head of the task to allow conversion between tasks.~~ Maybe there...
Use [PipeOpStratify](https://github.com/mlr-org/smashy/blob/master/R/PipeOpStratify.R). Things to consider: * Explicit output with defined channels as an option. * Empty multiplicities don't work yet I believe, this needs to be repaired. * option to...
The man page for `imputehist` does not explain what "imputed by (column-wise) histogram" means. It would be helpful to understand more about what this actually means!
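One plausible reading of "imputed by (column-wise) histogram" is: for each column, build a histogram of the observed values, pick a bin for every missing value with probability proportional to the bin count, then draw a value uniformly within that bin. The base-R sketch below shows that idea under this assumption; `impute_hist()` is a hypothetical helper, not the package's implementation.

```r
# Hypothetical helper illustrating histogram-based imputation for one column.
impute_hist <- function(x) {
  missing <- is.na(x)
  # Nothing to impute, or nothing to learn from.
  if (!any(missing) || all(missing)) return(x)
  h <- graphics::hist(x[!missing], plot = FALSE)
  n_missing <- sum(missing)
  # Choose one bin per missing value, weighted by how many observed values fell into it.
  bins <- sample.int(length(h$counts), size = n_missing, replace = TRUE, prob = h$counts)
  # Draw uniformly between the lower and upper break of each chosen bin.
  x[missing] <- stats::runif(n_missing, min = h$breaks[bins], max = h$breaks[bins + 1])
  x
}

set.seed(1)
x <- c(1.2, 3.4, NA, 2.8, 5.1, NA, 4.0)
imputed <- impute_hist(x)
```

Applied column by column, this keeps the imputed values roughly in line with each feature's observed distribution, rather than placing them all at a single point such as the mean.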
This is because S3 print methods take precedence over R6 print methods.
All `Task`-`PipeOp`s should automatically call `$predict` on the `"test"`-rows, so operations (or learners) that come later in the stream can rely on the `"test"` rows being valid. Ideally this should...
... and not the datatypes it gets from `$data()` (https://github.com/mlr-org/mlr3/issues/685). Otherwise subsequent PipeOpTaskPreprocs can give errors about incongruent train/predict-tasks.
Another POFU issue... apparently `DataBackendRename$missings` breaks when queried for no columns. This happens because `cbind()` overwrites all columns in this case. ```r > gr >!% po("featureunion", innum = c("a", ""))...