mixOmics
Enhancement for #205
test: first commit to trial the new test system, which draws ground-truth values from external RData files
Hey @aljabadi
I'd love your feedback for this PR as I'm still in the early stages of it and any changes you suggest would be better now rather than later.
For a given test case (one that isn't testing for a specific error), rather than just testing for a single numeric output, for example:
expect_equal(matrix(auc.plsda$Comp1), rbind(0.863, 2.473e-05))
I am saving the entire output to a repo (found here) and loading it in within each test and testing the function against that entire output object, example:
GH.URL <- "https://github.com/Max-Bladen/mixOmicsWork/blob/main/Test%20Ground%20Truths/auroc/basic.plsda.RData?raw=true"
.load_url(GH.URL)
data(breast.tumors)
X <- breast.tumors$gene.exp
Y <- breast.tumors$sample$treatment
res.plsda <- plsda(X, Y, ncomp = 2)
plsda.auroc = auroc(res.plsda, roc.comp = 1, print = FALSE)
expect_equal(plsda.auroc, GT.plsda.auroc)
where GT.plsda.auroc is the object saved at the GH.URL (GT standing for Ground Truth).
My rationale is that when testing only individual numeric values, unintended changes to the output's values or structure outside of that one value can be missed entirely. The one downside I see is runtime: each call to .load_url() takes about 2-3 seconds.
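For context, here is a minimal sketch of what a `.load_url()` helper might look like (the function body is an assumption; the actual helper in the PR may differ):

```r
# Hypothetical sketch of .load_url() (ASSUMPTION: the real helper may differ).
# Downloads an .RData file from a URL to a temporary file and loads its
# objects into the caller's environment.
.load_url <- function(url, envir = parent.frame()) {
  tmp <- tempfile(fileext = ".RData")
  on.exit(unlink(tmp))
  download.file(url, tmp, mode = "wb", quiet = TRUE)
  load(tmp, envir = envir)  # returns the names of the loaded objects
}
```

This network round-trip on every call is where the 2-3 second per-test overhead comes from.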
You can see that I'm trying to impose more structure on each test file, so that new tests can be added easily. I also want the analytical and graphical methods to have all (or most) of their potential input objects tested, as well as ensuring each is tested on a variety of datasets. This structure and expansion of the test sets is outlined on this wiki page.
Hi @Max-Bladen,
Extending the tests to cover further elements of the output is a great idea. However, we want to keep the tests as isolated from resources outside the repo as possible.
We also need to be aware that some output elements are generated by dependencies (say, a graph); as long as those dependencies merely adjust their visualisation, changes to them should not break our tests.
Here is what I suggest for expanding the tests to further elements:
- Subset the object to include only certain elements most important to our methods
- Ensure the subset is of a small size
- Ensure the object is representable by the dput function (it prints an object's definition so the object can be recreated programmatically, instead of having to save and reload it). Be mindful to exclude non-standard elements (e.g. <environment>). See the example below
library(testthat)
library(ggplot2)
# my ground truth output
output = list(data = data.frame(a=c(1,2,3)), graph = ggplot(data.frame(b=c(5,6,7))) + geom_point(), class=c('data.frame', 'foo'))
# dput shows the object definition which has non-standard elements (e.g. facet)
dput(output)
#> list(
#>   data = structure(list(a = c(1, 2, 3)), class = "data.frame", row.names = c(NA, -3L)),
#>   graph = structure(
#>     list(
#>       data = structure(list(), class = "waiver"),
#>       layers = list(<environment>),
#>       scales = <environment>,
#>       mapping = structure(list(), names = character(0), class = "uneval"),
#>       theme = list(),
#>       coordinates = <environment>,
#>       facet = <environment>,
#>       plot_env = <environment>,
#>       labels = list()
#>     ),
#>     class = c("gg", "ggplot")
#>   ),
#>   class = c("data.frame", "foo")
#> )
# I'm only interested in c('data', 'class') so I only keep them
test_els = c('data', 'class')
dput(output[test_els])
#> list(data = structure(list(a = c(1, 2, 3)), class = "data.frame", row.names = c(NA, -3L)), class = c("data.frame", "foo"))
output_ground_truth = list(data = structure(list(a = c(1, 2, 3)), class = "data.frame", row.names = c(NA, -3L)), class = c("data.frame", "foo"))
expect_equal(output[test_els], output_ground_truth)
# Note that I can be more specific, for instance by adding `output$graph$data` data.frame to
# the output and checking for that too, if that is relevant to my test case
You can even separate the definition of these ground-truth elements into testthat/helper-*.R scripts if they add too much clutter (see https://www.r-bloggers.com/2020/11/helper-code-and-files-for-your-testthat-tests/). I'd intentionally restrict test cases to those implementable this way, to avoid overcomplicating the package.
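As an illustration of the helper-file approach (the file name is hypothetical; the values are taken from the auroc example discussed in this thread), the ground truths could live in a file that testthat sources automatically before the tests run:

```r
# tests/testthat/helper-ground-truths.R  (hypothetical file name)
# Objects defined here are visible to every test file in the suite,
# keeping the test-*.R files themselves free of long dput() output.
GT.plsda.auroc <- list(
  Comp1 = structure(c(0.863, 2.473e-05), .Dim = 1:2,
                    .Dimnames = list("AF vs BE", c("AUC", "p-value")))
)
```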
Thanks for the feedback @aljabadi. I agree with everything you mentioned. I didn't know about dput, that's very handy. So instead of this (previously):
GH.URL <- "https://github.com/Max-Bladen/mixOmicsWork/blob/main/Test%20Ground%20Truths/auroc/basic.plsda.RData?raw=true"
.load_url(GH.URL)
data(breast.tumors)
X <- breast.tumors$gene.exp
Y <- breast.tumors$sample$treatment
res.plsda <- plsda(X, Y, ncomp = 2)
plsda.auroc = auroc(res.plsda, roc.comp = 1, print = FALSE)
expect_equal(plsda.auroc, GT.plsda.auroc)
do you feel something like this is more appropriate?
testable.components <- c("Comp1", "Comp2")
GT <- list(Comp1 = structure(c(0.863, 2.473e-05),
.Dim = 1:2,
.Dimnames = list("AF vs BE", c("AUC", "p-value"))),
Comp2 = structure(c(0.9981, 7.124e-09),
.Dim = 1:2,
.Dimnames = list("AF vs BE", c("AUC", "p-value"))))
data(breast.tumors)
X <- breast.tumors$gene.exp
Y <- breast.tumors$sample$treatment
res.plsda <- plsda(X, Y, ncomp = 2)
plsda.auroc = auroc(res.plsda, roc.comp = 1, print = FALSE)
invisible(capture.output(TT <- dput(plsda.auroc[testable.components])))
expect_equal(TT, GT)
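One possible simplification (a sketch, not a requirement): dput() is only needed once, offline, to print the ground-truth definition that gets pasted into the test. At test time a plain subset is enough, so the capture.output(dput(...)) wrapper can be dropped:

```r
# Minimal demonstration with a stand-in output object (names illustrative).
output <- list(Comp1 = c(AUC = 0.863), graph = "not tested")
testable.components <- c("Comp1")

TT <- output[testable.components]   # plain subset; no capture.output/dput
GT <- list(Comp1 = c(AUC = 0.863))  # pasted from a one-off dput() call
stopifnot(identical(TT, GT))
```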
Oh also, are you happy with the different types of tests and how I'm structuring things?
Sorry for the repeated messages @aljabadi. I've just been adjusting the test-auroc.R file to the new style (as seen in the comment above). Things get very messy, very quickly, even with minimal outputs, and this is with auroc(), which has a comparatively small set of outputs. Can I suggest that we keep the ground truths in a separate file for each test-....R file? Maybe have a folder (e.g. ground truths) within the tests folder that we source. It would certainly make each file significantly cleaner.
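A self-contained sketch of that layout (directory and file names are assumptions, with a temp directory standing in for the proposed folder): each test file would source its matching ground-truth script before running any expectations.

```r
# Stand-in for the proposed ground-truths folder (ASSUMPTION: layout TBD).
gt.dir <- file.path(tempdir(), "ground-truths")
dir.create(gt.dir, showWarnings = FALSE)

# One ground-truth script per test file, e.g. for test-auroc.R:
writeLines("GT.plsda.auroc <- list(Comp1 = c(AUC = 0.863))",
           file.path(gt.dir, "auroc.R"))

# At the top of test-auroc.R:
source(file.path(gt.dir, "auroc.R"))  # defines GT.plsda.auroc
```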
Hi @Max-Bladen, If you feel like storing test data is inevitable, I recommend you use inst/testdata. See https://stackoverflow.com/a/41777777.
Please be mindful that our package size currently exceeds the standard 5 MB limit, so this should be a last resort. The scripts that generate these files must be included within the test functions, but commented out.
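A sketch of that pattern (file names are assumptions; a temp file stands in for inst/testdata/ so the snippet is self-contained). In the installed package, the fixture would be read with system.file("testdata", ..., package = "mixOmics"):

```r
# Stand-in for inst/testdata/ (ASSUMPTION: the real file lives in the package).
fixture <- file.path(tempdir(), "auroc-basic-plsda.rds")

## Generation script: kept in the test file but commented out, re-run
## manually only when a function's output legitimately changes:
## plsda.auroc <- auroc(res.plsda, roc.comp = 1, print = FALSE)
## saveRDS(plsda.auroc, fixture)
saveRDS(list(Comp1 = c(AUC = 0.863)), fixture)  # stand-in ground truth

GT.plsda.auroc <- readRDS(fixture)
```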
Hi @Max-Bladen, looks awesome. Happy to see the files don't take much space.
If a function is improved and the outcomes change, what is the procedure for changing the fixture data? Is this documented and easy to implement?
The procedure is quite simple. I have some notes written down locally. Once I've worked my way through most of the work for this PR (see the last page of the Wiki), I'll update that page with a step-by-step process on how to change this data. I've tried to design this with adaptability and expansion in mind.
After discussion, this idea was scrapped.