fix batch_size param
#420
Need to add tests
I still think the batch_size parameter should be required for prediction.
This seems reasonable, but there are CI errors:
══ Failed tests ════════════════════════════════════════════════════════════════
── Error ('test_PipeOpModule.R:6:3'): PipeOpModule: basic checks ───────────────
Error in `expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values)`: object 'paradox_info' not found
Backtrace:
▆
1. └─mlr3torch:::expect_pipeop(po_fn) at test_PipeOpModule.R:6:3
2. └─mlr3torch:::expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values) at tests/testthat/helper_functions.R:64:3
── Error ('test_PipeOpTaskPreprocTorch.R:3:3'): basic ──────────────────────────
Error in `expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values)`: object 'paradox_info' not found
Backtrace:
▆
1. └─mlr3torch:::expect_pipeop(po_test) at test_PipeOpTaskPreprocTorch.R:3:3
2. └─mlr3torch:::expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values) at tests/testthat/helper_functions.R:64:3
── Error ('test_PipeOpTorch.R:6:3'): Basic checks ──────────────────────────────
Error in `expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values)`: object 'paradox_info' not found
Backtrace:
▆
1. └─mlr3torch:::expect_pipeop(obj) at test_PipeOpTorch.R:6:3
2. └─mlr3torch:::expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values) at tests/testthat/helper_functions.R:64:3
── Error ('test_PipeOpTorchBlock.R:5:3'): linear graph ─────────────────────────
Error in `expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values)`: object 'paradox_info' not found
...
Yeah, it's not done yet. I got sidetracked but will probably address this Monday.
For a test case, I would suggest modifying the example code from https://github.com/tdhock/mlr3torchAUM/blob/main/man/batch_sampler_stratified.Rd
First I installed mlr3torch from this PR branch, then I ran the example code.
devtools::install_github("mlr-org/mlr3torch@dont-require-batch-size")
## Imbalanced version of sonar data set.
sonar_task <- mlr3::tsk("sonar")
sonar_task$col_roles$stratum <- "Class"
sonar_task$filter(208:86) # for imbalance.
batch_sampler_class <- mlr3torchAUM::batch_sampler_stratified(min_samples_per_stratum = 1)
L_size_sampler <- mlr3torch::LearnerTorchMLP$new(task_type="classif")
L_size_sampler$param_set$set_values(
epochs=1, batch_size=10, seed=1,
batch_sampler=batch_sampler_class)
L_size_sampler$train(sonar_task)
L_sampler <- mlr3torch::LearnerTorchMLP$new(task_type="classif")
L_sampler$param_set$set_values(
epochs=1, seed=1,
batch_sampler=batch_sampler_class)
L_sampler$train(sonar_task)
L_no_shuffle <- mlr3torch::LearnerTorchMLP$new(task_type="classif")
L_no_shuffle$param_set$set_values(
epochs=1, seed=1, shuffle=NULL,
batch_sampler=batch_sampler_class)
L_no_shuffle$train(sonar_task)
L_no_shuffle$predict(sonar_task)
I get the output below:
> L_size_sampler$param_set$set_values(
+ epochs=1, batch_size=10, seed=1,
+ batch_sampler=batch_sampler_class)
> L_size_sampler$train(sonar_task)
Error: Provide either 'sampler', 'batch_sampler', or 'batch_size'.
The output above is an expected error message: both batch_size and batch_sampler were provided, but only one should be.
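For reference, the check that produces this message could look something like the sketch below. This is illustrative only; check_batch_args is a hypothetical helper, not the actual mlr3torch implementation:

```r
# Hypothetical sketch of a mutual-exclusivity check that would produce the
# error above: exactly one of the three arguments may be non-NULL.
check_batch_args <- function(batch_size = NULL, sampler = NULL, batch_sampler = NULL) {
  n_set <- sum(!is.null(batch_size), !is.null(sampler), !is.null(batch_sampler))
  if (n_set != 1L) {
    stop("Provide either 'sampler', 'batch_sampler', or 'batch_size'.")
  }
  invisible(TRUE)
}
```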
> L_sampler$param_set$set_values(
+ epochs=1, seed=1,
+ batch_sampler=batch_sampler_class)
> L_sampler$train(sonar_task)
Error: 'shuffle' and 'drop_last' are only allowed when 'batch_size' is provided.
The output above is another error message, indicating that shuffle should not be provided. Could this error message please be suppressed, for user-friendliness? (In this case the user did not explicitly set shuffle; its default value triggers the error, which seems unexpected.)
> L_no_shuffle$param_set$set_values(
+ epochs=1, seed=1, shuffle=NULL,
+ batch_sampler=batch_sampler_class)
> L_no_shuffle$train(sonar_task)
> L_no_shuffle$predict(sonar_task)
Error in .__LearnerTorch__.dataloader_predict(self = self, private = private, :
'batch_size' must be provided for prediction.
Above we see that if shuffle is set to NULL, then training works without error (as expected), but prediction gives an error (not expected). Also, it would be more user-friendly if we did not have to explicitly set shuffle=NULL.
I will create simpler test cases based on these examples.
I propose the following test cases:
sonar_task <- mlr3::tsk("sonar")
batch_sampler_class <- torch::sampler(
  "Batch_Sampler_Always_Return_One",
  initialize = function(...) {},
  .iter = function() function() 1,
  .length = function() 1)
library(testthat)
test_that("batch_size with batch_sampler is an error", {
  L_size_sampler <- mlr3torch::LearnerTorchMLP$new(task_type = "classif")
  L_size_sampler$param_set$set_values(
    epochs = 1, batch_size = 10,
    batch_sampler = batch_sampler_class)
  expect_error({
    L_size_sampler$train(sonar_task)
  }, "Provide either 'sampler', 'batch_sampler', or 'batch_size'.")
})
test_that("train and predict work with batch_sampler, without setting batch_size or shuffle", {
  L_sampler <- mlr3torch::LearnerTorchMLP$new(task_type = "classif")
  L_sampler$param_set$set_values(
    epochs = 1,
    batch_sampler = batch_sampler_class)
  L_sampler$train(sonar_task)
  # expect_is is deprecated in testthat 3e, so expect_s3_class is used instead
  expect_s3_class(L_sampler$model, "learner_torch_model")
  pred <- L_sampler$predict(sonar_task)
  expect_s3_class(pred, "PredictionClassif")
})
The first test case passes with the current code, whereas the second one fails.
For training to work without the user setting shuffle, I would suggest updating how the default shuffle is handled (the current default is shuffle = TRUE). Possible fixes:
- change default shuffle to NULL? (which could be interpreted as TRUE when batch_size is provided?)
- change error checking logic so that shuffle TRUE is not an error with batch_sampler?
- keep current shuffle logic and update test to set shuffle=NULL? (seems not user-friendly)
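A sketch of the first option, assuming the dataloader arguments are resolved before validation. Note that resolve_shuffle is a hypothetical helper, not existing mlr3torch code:

```r
# Hypothetical helper: with a NULL default, shuffle only becomes TRUE when
# batch_size is actually provided, so a batch_sampler no longer conflicts
# with an unset shuffle.
resolve_shuffle <- function(shuffle = NULL, batch_size = NULL) {
  if (is.null(shuffle)) {
    return(!is.null(batch_size))  # interpret NULL as TRUE iff batch_size is given
  }
  shuffle
}
```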
For prediction to work, I guess there needs to be some change in how batch_size is determined for the prediction dataloader?
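One possible direction, sketched with a hypothetical helper (resolve_predict_batch_size is not existing mlr3torch code): when batch_size is NULL at prediction time, fall back to a sensible value such as the number of observations in the task.

```r
# Hypothetical fallback for the prediction dataloader: if no batch_size was
# set (e.g. because a batch_sampler was used for training), predict on the
# whole task in a single batch.
resolve_predict_batch_size <- function(batch_size, n_obs) {
  if (is.null(batch_size)) n_obs else batch_size
}
```

Whether a single full-size batch is acceptable memory-wise would need discussion; the point is only that prediction need not fail outright.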
Does that seem reasonable @sebffischer ?
I still think the batch_size parameter should be required for prediction.
why?
Batch size is a concept defined for training (it is the number of samples used to compute each gradient), and there are no gradients involved in prediction.