Getting different md5 hash keys for the same artifact under unit-test settings (even though the archive is empty on each attempt).

Open harell opened this issue 5 years ago • 2 comments

When I add a unit test to the package that checks an artifact's md5hash, the unit test fails.

Given an artifact, and a path pointing at where to create the archive

When I initialize the archive (deleting any former archive and creating a new one)
AND I use saveToLocalRepo to store the artifact in the archive
AND I store the md5 hash key given by saveToLocalRepo

Then the md5 hash key should be identical to digest::digest(<artifact>, "md5")

The following script tests the above.

It works well when running the code line by line, but fails when it is included as part of the test suite (under ~/tests/testthat).

test_that("saveToLocalRepo generates identical hash keys", {
    ###########
    ## Setup ##
    ###########
    repo_dir <- tempdir()
    mdl_1 <- lm(mpg ~ ., mtcars[,c(1,2)])


    #####################
    ## Helper Function ##
    #####################
    init_archive <- function(path){
        set.seed(1235)
        unlink(path, recursive = TRUE, force = TRUE)
        dir.create(path, showWarnings = FALSE, recursive = TRUE)
        archivist::createLocalRepo(path, force = TRUE, default = FALSE)
    }


    #######################################
    ## Create Object via saveToLocalRepo ##
    #######################################
    init_archive(repo_dir)
    md5hash_archivist_1 <- archivist::saveToLocalRepo(artifact = mdl_1,
                                                      repoDir = repo_dir,
                                                      value = FALSE,
                                                      force = TRUE)
    init_archive(repo_dir)
    md5hash_archivist_2 <- archivist::saveToLocalRepo(artifact = mdl_1,
                                                      repoDir = repo_dir,
                                                      value = FALSE,
                                                      force = TRUE)


    ###########
    ## Tests ##
    ###########
    expect_equal(md5hash_archivist_1, md5hash_archivist_2, check.attributes = FALSE, use.names = FALSE)
    expect_equal(md5hash_archivist_1, digest::digest(mdl_1, "md5"), check.attributes = FALSE, use.names = FALSE)
    expect_equal(md5hash_archivist_2, digest::digest(mdl_1, "md5"), check.attributes = FALSE, use.names = FALSE)
})

The errors show that the function returns different hash key values.

Any idea what causes the unit test to fail?

harell commented on Mar 16 '19 00:03

Thanks, will look into this.

pbiecek commented on Apr 17 '19 13:04

Just got a hint about why it happens in this thread. It's potentially solvable by wrapping the artifact with a function before hashing. See the note here.

harell commented on Jul 25 '19 08:07
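
For readers who land here, the following is a minimal sketch of one possible explanation and of the "wrap the artifact before hashing" idea. It is an assumption, not something confirmed in this issue: an lm object stores the environment in which its formula was created, and serialization (which digest::digest relies on) includes that environment's contents. Inside test_that() the code runs in a local environment whose bindings change between the two saves, whereas at the console the formula captures the global environment, which is serialized as a reference only. The helper strip_lm_environments() is hypothetical and is not part of archivist or digest; local() stands in for the test_that() block.

```r
library(digest)

# Hypothetical helper (not part of archivist or digest): point the formula
# environments stored inside an lm object at the global environment, which
# serialize() records as a reference instead of by content.
strip_lm_environments <- function(model) {
  attr(model$terms, ".Environment") <- globalenv()
  attr(attr(model$model, "terms"), ".Environment") <- globalenv()
  model
}

# Mimic a test_that() block: the code runs in a local environment, and the
# formula inside the model captures that environment.
hashes <- local({
  mdl <- lm(mpg ~ ., mtcars[, c(1, 2)])

  raw_before    <- digest(mdl, "md5")
  stable_before <- digest(strip_lm_environments(mdl), "md5")

  # Any new binding changes the contents of the captured environment.
  unrelated <- "a new binding in the same environment"

  raw_after    <- digest(mdl, "md5")
  stable_after <- digest(strip_lm_environments(mdl), "md5")

  c(raw_before = raw_before, raw_after = raw_after,
    stable_before = stable_before, stable_after = stable_after)
})

# Compare the hashes of the untouched model and of the wrapped model.
print(hashes[["raw_before"]] == hashes[["raw_after"]])
print(hashes[["stable_before"]] == hashes[["stable_after"]])
```

If this reading is right, the first comparison is expected to differ and the second to match. In the unit test above, the same wrapper would have to be applied to mdl_1 both before calling saveToLocalRepo and before calling digest::digest, so that both sides hash the same normalized object.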