
test.join aggregation does not join probabilities

Open berndbischl opened this issue 9 years ago • 6 comments

task = sonar.task
lrn = makeLearner("classif.rpart", predict.type = "prob")

rin = makeResampleDesc("Holdout")
mm = setAggregation(auc, test.join)
r = resample(lrn, task, rin, measures = mm)

This leads to the following error: Error in getPredictionProbabilities(pred, cl = levs) : Trying to get probabilities for nonexistant classes: M,R

It is also obvious from the source code. NB: we have potentially more columns we need to join like se and the stuff from survival...

@mllg as you pushed this some time ago

berndbischl avatar Dec 19 '15 11:12 berndbischl

Did the PR solve this completely or should we leave this open?

larskotthoff avatar Oct 13 '16 07:10 larskotthoff

The part

NB: we have potentially more columns we need to join like se and the stuff from survival...

was not addressed in the PR, so we could leave this open.

giuseppec avatar Oct 13 '16 07:10 giuseppec

The problem persists for survival tasks:

task = wpbc.task
lrn = makeLearner("surv.rpart")
rin = makeResampleDesc("Holdout")
mm = setAggregation(cindex, test.join)
r = resample(lrn, task, rin, measures = mm)
#[Resample] holdout iter: 1
#Error in FUN(X[[i]], ...) : 
#  You need to have 'truth.time' and 'truth.event' columns in your pred object for measure cindex!

jakob-r avatar Nov 10 '16 15:11 jakob-r

And for multilabel tasks:

task = yeast.task
lrn = makeLearner("multilabel.cforest")
rin = makeResampleDesc("Holdout")
mm = setAggregation(multilabel.acc, test.join)
r = resample(lrn, task, rin, measures = mm)
#[Resample] holdout iter: 1
#Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
#  invalid 'row.names' length

jakob-r avatar Nov 10 '16 16:11 jakob-r

In other words: we should not write custom code for every task class in test.join, but find a generalized solution.
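
A rough sketch of what a task-agnostic join could look like, in base R only. This is not mlr's actual implementation: `join_pred_data` is a hypothetical helper, and the toy data frames merely imitate per-iteration prediction tables with probability columns. The idea is to row-bind every column the tables share instead of picking columns per task class.

```r
# Hypothetical helper, NOT part of mlr: join per-iteration prediction
# tables by row-binding all of their columns, whatever the task type.
join_pred_data = function(pred_list) {
  stopifnot(length(pred_list) > 0L)
  cols = names(pred_list[[1L]])
  # Every table must carry the same set of columns, otherwise the join is ambiguous.
  same = vapply(pred_list, function(d) setequal(names(d), cols), logical(1L))
  if (!all(same))
    stop("All prediction tables must have the same columns")
  # Align the column order to the first table, then stack the rows.
  joined = do.call(rbind, lapply(pred_list, function(d) d[, cols, drop = FALSE]))
  rownames(joined) = NULL
  joined
}

# Toy stand-ins for two resampling iterations of a probabilistic classifier
# (column names are illustrative assumptions).
p1 = data.frame(id = 1:2, truth = c("M", "R"),
  prob.M = c(0.8, 0.3), prob.R = c(0.2, 0.7), response = c("M", "R"))
p2 = data.frame(id = 3:4, truth = c("R", "M"),
  prob.M = c(0.4, 0.9), prob.R = c(0.6, 0.1), response = c("R", "M"))

joined = join_pred_data(list(p1, p2))
print(joined)
```

Because no column is selected by name, probability columns, se, or survival-specific truth columns would be carried along in the same way, provided every iteration produces the same columns.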

jakob-r avatar Nov 10 '16 16:11 jakob-r

library(mlr)
#> Loading required package: ParamHelpers

task = wpbc.task
lrn = makeLearner("surv.rpart")
rin = makeResampleDesc("Holdout")
mm = setAggregation(cindex, test.join)
r = resample(lrn, task, rin, measures = mm)
#> Resampling: holdout
#> Measures:             cindex
#> [Resample] iter 1:    0.6156373
#> Error in FUN(X[[i]], ...): You need to have 'truth.time' and 'truth.event' columns in your pred object for measure cindex!

Created on 2019-12-31 by the reprex package (v0.3.0)

pat-s avatar Dec 31 '19 13:12 pat-s