MLJTutorial.jl

Tutorial 04: `predict(..., rows = 1:3)`

Open roland-KA opened this issue 3 years ago • 3 comments

In part 04 of the tutorial, in the section 'The tuning wrapper', there is a `predict` statement applied to the tuned model/machine: `predict(tuned_mach, rows = 1:3)`

It is not quite clear to me why `1:3` is passed to `rows`. I would have expected that XHorse (or a subset of it) would be used.

roland-KA, Nov 16 '21 18:11

Thanks for this query.

> I would have expected that XHorse (or a subset of it) would be used.

It is being used. Calling `predict(mach, rows=1:3)` is equivalent to `predict(mach, Xsubset)`, where `Xsubset = selectrows(X, 1:3)` and `X` is the data you bound to the machine `mach` when you created it (in your case `Xhorse`, I guess). The general idea in training/validation is to pass rows around instead of subsets of the data. It's probably good practice to split off a separate lock-away-and-throw-away-the-key test set `Xtest`, `ytest` at the beginning, which you do not include with the data in your machine (you can use `partition` for tables and multidimensional arrays as well as vectors), but I didn't bother with that in these tutorials.
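For anyone landing here later, a minimal sketch of the equivalence described above. It uses synthetic data from `make_regression` and the built-in `ConstantRegressor` as stand-ins for the tutorial's horse data and tuned model; the variable names (`Xtrain`, `yhat_rows`, and so on) are made up for illustration and do not come from the tutorial.

```julia
using MLJ

# Synthetic stand-in for the tutorial's data (an assumption, not the horse dataset).
X, y = make_regression(20, 3; rng=123)

# Lock away a test set *before* binding any data to a machine.
# `multi=true` lets `partition` split the table and the target together.
(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8; multi=true, rng=123)

# A built-in model, chosen only so the example needs no extra packages.
model = ConstantRegressor()
mach = machine(model, Xtrain, ytrain)   # Xtrain, ytrain are now bound to `mach`

# Train on a subset of the bound data by passing rows, not a data subset.
fit!(mach, rows=4:16, verbosity=0)

# These two calls predict on the same three observations of the bound data:
yhat_rows = predict(mach, rows=1:3)
yhat_data = predict(mach, selectrows(Xtrain, 1:3))

# The held-out data was never bound to the machine, so it is passed explicitly:
yhat_test = predict(mach, Xtest)
```

Here `rows=1:3` always indexes into whatever data was bound at `machine(...)` time, which is why no table needs to be passed in the first `predict` call.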

ablaom, Nov 17 '21 19:11

Thanks for the explanation! It wasn't clear to me that the data bound to the machine can be referenced using this syntax.

A short comment about this would perhaps be helpful for people reading the tutorial, because I think they may not be familiar either with the mechanisms an MLJ machine introduces.

roland-KA, Nov 18 '21 11:11

Yes, let's leave this open to remind me.

ablaom, Nov 19 '21 04:11