[Question] What methods are cross-platform

Open dipterix opened this issue 3 years ago • 3 comments

Sorry if this is a dumb question. Of all the supported methods, is there a list of those that are reproducible across platforms?

The conditions include:

  • On different CPU arch (x86, x64, ARM)
  • On different OS (Windows, OS X, Ubuntu, Solaris, ...)
  • Little / big endianness
  • Using digest(serialize=FALSE, seed=0)

For example, can I expect digest("aaa", serialize=FALSE, algo="xxhash64") to produce the EXACT same result on all OSes, endiannesses, and CPUs?

dipterix, Apr 25 '22 17:04

In general, there are two aspects here:

  • Is the object serialization the same? R handles this step, so you have to check the output that serialize() creates.

  • Is the hash digest the same? In general it should be.
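The two aspects can be separated in a short sketch. It assumes the digest package is installed; the input "abc" and the variable names are illustrative only. With serialize=FALSE the hash covers just the bytes of the string, so the md5 result can be compared against the published RFC 1321 value. With serialize=TRUE the hash covers what serialize() produces, which is a different byte stream.

```r
library(digest)

x <- "abc"  # illustrative input, not taken from the question

# Aspect 1: which bytes get hashed.
# serialize=FALSE hashes the string's own bytes.
raw_bytes <- charToRaw(x)
# serialize=TRUE hashes R's serialized form, which wraps the string
# in a header plus type information and is therefore longer.
ser_bytes <- serialize(x, connection = NULL)
print(length(raw_bytes))
print(length(ser_bytes) > length(raw_bytes))

# Aspect 2: the hash of those bytes.
h_raw <- digest(x, algo = "md5", serialize = FALSE)
h_ser <- digest(x, algo = "md5", serialize = TRUE)

# h_raw should equal the RFC 1321 test vector for "abc"
print(identical(h_raw, "900150983cd24fb0d6963f7d28e17f72"))
# Hashing the serialized form gives a different digest
print(h_raw != h_ser)
```

If the first comparison holds on every machine you care about, the hash step is reproducible for that input. Whether the serialized form matches across R versions and platforms is a separate check.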

Look, for example, at the package unit tests, which check against invariant test output. Here is one for md5:

library(digest)    # digest(), getVDigest()
library(tinytest)  # expect_true(), expect_identical()

## Standard RFC 1321 test vectors
md5Input <-
    c("",
      "a",
      "abc",
      "message digest",
      "abcdefghijklmnopqrstuvwxyz",
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789",
      paste("12345678901234567890123456789012345678901234567890123456789012",
            "345678901234567890", sep=""))
md5Output <-
    c("d41d8cd98f00b204e9800998ecf8427e",
      "0cc175b9c0f1b6a831c399e269772661",
      "900150983cd24fb0d6963f7d28e17f72",
      "f96b697d7cb7938d525a2f31aaf161d0",
      "c3fcd3d76192e4007dfb496cca67e13b",
      "d174ab98d277d9f5a5611c2c9f419d9f",
      "57edf4a22be3c955ac49da2e2107b67a")

## Scalar interface: hash each input string as-is (no serialization)
for (i in seq_along(md5Input)) {
    md5 <- digest(md5Input[i], algo="md5", serialize=FALSE)
    expect_identical(md5, md5Output[i])
}

## Vectorised interface: hash all inputs in one call
md5 <- getVDigest()
expect_identical(md5(md5Input, serialize = FALSE), md5Output)

We run this test on every platform on which the package is checked, and we have similar checks for the other methods.

As your question was very specifically about xxhash64, I would encourage you to see what its upstream repo has to say about the matter. We just use it here as one among a number of hashing functions.
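For a specific call like the one in the question, a direct check is also possible: run the same line on each machine of interest and compare the output. The sketch below assumes the digest package is installed and uses only base R to report the platform; the `info` and `h` names are illustrative. No expected hash value is shown on purpose, since the point is to compare what your own machines print.

```r
library(digest)

# Platform facts relevant to the question (all base R)
info <- c(os     = R.version$os,
          arch   = R.version$arch,
          endian = .Platform$endian)

# The call from the question; seed = 0 is also digest()'s default
h <- digest("aaa", algo = "xxhash64", serialize = FALSE, seed = 0)

# Print one line per machine, then compare the lines across machines
cat(paste(names(info), info, sep = "=", collapse = " "),
    paste0("xxhash64=", h), "\n")
```

Identical hash values on the platforms you test are evidence for those platforms only, not a guarantee for others.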

eddelbuettel, Apr 25 '22 17:04

Thanks for answering my question.

> Is the hash digest the same? In general it should be.

Is this the case for sha-256?

Thanks

dipterix, Apr 25 '22 18:04

Well, everything I said in my previous comment applies here too, so I do not understand what you are asking now.

Also, as you know, the package is open source and there are sha256 unit tests.
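A check of that kind can be sketched in the same style as the md5 example. This is not a copy of the package's own sha256 tests; it is a minimal illustration that assumes the digest package is installed and uses two widely published SHA-256 values (for the empty string and for "abc"). The variable names are made up for this example.

```r
library(digest)

# Widely published SHA-256 values for two standard inputs
sha256Input  <- c("", "abc")
sha256Output <- c(
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad")

# Hash each string as-is (serialize=FALSE), dropping names from the result
sha256Got <- vapply(sha256Input,
                    function(s) digest(s, algo = "sha256", serialize = FALSE),
                    character(1), USE.NAMES = FALSE)

# Stops with an error if any digest deviates from the published value
stopifnot(identical(sha256Got, sha256Output))
```

Running this on each platform of interest gives the same kind of evidence for sha256 as the md5 test does for md5.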

eddelbuettel, Apr 25 '22 19:04