testthat icon indicating copy to clipboard operation
testthat copied to clipboard

Add option to expect_snapshot_*() not print entire value

Open mpjashby opened this issue 2 years ago • 1 comments

When called via devtools::check() or similar, expect_snapshot_value() prints the snapshot if it is new or has changed. This is useful for short values, but for longer values it floods the console or RStudio Build panel. For very-long values this can drastically slow RStudio for several minutes.

For example, testing if just the first six rows of the mtcars dataset are unchanged produces quite lengthy output. I've used the style = "serialize" argument here not because it's the best way to encode this particular value, but because serialisation is often the only way to make a snapshot of complex values.

library(testthat)

test_that("`mtcars` has the same value it always had", {
  local_edition(3)
  expect_snapshot_value(head(mtcars), style = "serialize")
})
#> Can't compare snapshot to reference when testing interactively
#> ℹ Run `devtools::test()` or `testthat::test_file()` to see changes
#> ℹ Current value:
#> WAoAAAACAAQBAgACAwAAAAMTAAAACwAAAA4AAAAGQDUAAAAAAABANQAAAAAAAEA2zMzMzMzN
#> QDVmZmZmZmZAMrMzMzMzM0AyGZmZmZmaAAAADgAAAAZAGAAAAAAAAEAYAAAAAAAAQBAAAAAA
#> AABAGAAAAAAAAEAgAAAAAAAAQBgAAAAAAAAAAAAOAAAABkBkAAAAAAAAQGQAAAAAAABAWwAA
#> AAAAAEBwIAAAAAAAQHaAAAAAAABAbCAAAAAAAAAAAA4AAAAGQFuAAAAAAABAW4AAAAAAAEBX
#> QAAAAAAAQFuAAAAAAABAZeAAAAAAAEBaQAAAAAAAAAAADgAAAAZADzMzMzMzM0APMzMzMzMz
#> QA7MzMzMzM1ACKPXCj1wpEAJMzMzMzMzQAYUeuFHrhQAAAAOAAAABkAE9cKPXCj2QAcAAAAA
#> AABAAo9cKPXCj0AJuFHrhR64QAuFHrhR64VAC64UeuFHrgAAAA4AAAAGQDB1wo9cKPZAMQUe
#> uFHrhUAynCj1wo9cQDNwo9cKPXFAMQUeuFHrhUA0OFHrhR64AAAADgAAAAYAAAAAAAAAAAAA
#> AAAAAAAAP/AAAAAAAAA/8AAAAAAAAAAAAAAAAAAAP/AAAAAAAAAAAAAOAAAABj/wAAAAAAAA
#> P/AAAAAAAAA/8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA4AAAAGQBAAAAAA
#> AABAEAAAAAAAAEAQAAAAAAAAQAgAAAAAAABACAAAAAAAAEAIAAAAAAAAAAAADgAAAAZAEAAA
#> AAAAAEAQAAAAAAAAP/AAAAAAAAA/8AAAAAAAAEAAAAAAAAAAP/AAAAAAAAAAAAQCAAAAAQAE
#> AAkAAAAFbmFtZXMAAAAQAAAACwAEAAkAAAADbXBnAAQACQAAAANjeWwABAAJAAAABGRpc3AA
#> BAAJAAAAAmhwAAQACQAAAARkcmF0AAQACQAAAAJ3dAAEAAkAAAAEcXNlYwAEAAkAAAACdnMA
#> BAAJAAAAAmFtAAQACQAAAARnZWFyAAQACQAAAARjYXJiAAAEAgAAAAEABAAJAAAACXJvdy5u
#> YW1lcwAAABAAAAAGAAQACQAAAAlNYXpkYSBSWDQABAAJAAAADU1hemRhIFJYNCBXYWcABAAJ
#> AAAACkRhdHN1biA3MTAABAAJAAAADkhvcm5ldCA0IERyaXZlAAQACQAAABFIb3JuZXQgU3Bv
#> cnRhYm91dAAEAAkAAAAHVmFsaWFudAAABAIAAAABAAQACQAAAAVjbGFzcwAAABAAAAABAAQA
#> CQAAAApkYXRhLmZyYW1lAAAA/g==
#> ── Skip (???): `mtcars` has the same value it always had ───────────────────────
#> Reason: empty test

Created on 2022-01-12 by the reprex package (v2.0.1)

Since expect_snapshot_value() is specifically designed to be used "when the expected value is large", it may be useful to have an argument of the expect_snapshot_*() family of functions that does not print the function, e.g. a quiet = TRUE argument that is FALSE by default (or when the value is shorter than a given threshold).

mpjashby avatar Jan 12 '22 22:01 mpjashby

This is definitely something needed. The printing to the console of the serialized data takes several minutes while the actual writing to file takes a few seconds.

bakaburg1 avatar Jan 17 '22 18:01 bakaburg1

If this is affecting you, I suspect you're using snapshot tests incorrectly — I don't think you should generally be snapshotting large objects.

hadley avatar Sep 19 '22 20:09 hadley

@hadley, maybe you should talk to one Hadley Wickham who wrote the sentence mpjashby quoted from the docs for snapshots:

Snapshot tests are useful for when the expected value is large

:)

At the very least this is a case of the docs contradicting the implementation.

stefanfritsch avatar Oct 02 '22 20:10 stefanfritsch

I was meaning large = a few lines of code, too much too type by hand. Not the hundreds or thousands of lines it would take to make the RStudio console take several minutes to print.

hadley avatar Oct 03 '22 11:10 hadley

If someone told you they were working with "large" values in R, would you expect they were talking about results <1MB? As I said above, you should at least be more explicit in the docs.

In data projects - as opposed to utility packages - I like to have some tests where I check for changes in the results as a whole and the main steps in the process. It's a nice tripwire to check if a fix in one place has repercussions in places I don't expect. And it verifies that only the expected steps are affected by a change (call it a poor man's targets). No, it's not unit testing by the book but it certainly is useful.

Now, I could use saveRDS(), readRDS() and expect_equal() for that use case; but you could say the same for every other use of expect_snapshot() (if you add capture.output()) or expect_known_value().

If you don't want to create a package for people not following your principles that's absolutely ok. But the functionality is already there - you just deprecated it without replacement and without immediate need afaict.

stefanfritsch avatar Oct 05 '22 10:10 stefanfritsch