Classes that sha1() should have methods for in digest
- Date
- complex
- array
Already implement types:
complexconverted to a 2-column matrix (Re, Im)Dateconverted to an integer (positive/negative) representing number of seconds since 1970
Any ideas for the array type ?
Same as for the other two? There is as.array() though I am not sure what one usually converts from...
array is a multi-dimensional matrix. One of the problems is to produce different hashes for differently shaped arrays that contain the same data:
# the same content for differenly shaped arrays
content <- 1:8
# creates a cube 2x2x2
a <- array(content, dim = c(2,2,2))
b <- array(content, dim = c(2,4,1))
# structures to be hashed
a_tohash <- list(
dim = dim(a),
data = as.numeric(a)
)
b_tohash <- list(
dim = dim(b),
data = as.numeric(b)
)
# should produce different hashes
sha1(a_tohash)
sha1(b_tohash)
> sha1(a_tohash)
[1] "96dbde57a89f37cf8fbbf9e706047a3fc822e705"
> sha1(b_tohash)
[1] "d51b50b0a6414a453e909a5d74686156edb099ef"
any other problems that need to be addressed before I create a pull request ?
Ah, yes, of course.
Well a 1:8 sequence with dim(2,2,2) is different from dim(2,4,1) so I think this is good enough.
Unit tests maybe (in the super-simple format we use here) ?
Note: the current implementation assumes that: sha1(2+0i) != sha1(2) (conservative approach, but maybe mathematically incorrect)
I've just found a related section in the vignette titled "Creating a sha1 method for other classes": https://github.com/eddelbuettel/digest/blob/master/vignettes/sha1.Rmd#creating-a-sha1-method-for-other-classes
Useful for new contributors.
Any types that should also be implemented ?
I've started this issue so everyone could add other classes to the wishlist. Keep in mind that @eddelbuettel wants to keep the list of dependencies as minimal as possible. That limits the classes to those available in base R.
Yes, I think anything besides base R is likely to be out of scope. Unless we all agree that it is worth it.
But minimal dependencies are a Good Thing (TM) for a widely-used utility package such as this.
what about zoo class for time series ?
I love zoo and and xts -- but this would get a little out of hand.
We cannot possibly follow up to every other S3 class on CRAN. Base types is good.
#41 add sha1 methods for the pairlist and name classes. It also adds documentation for the sha1 methods on array, complex and Date
#54 adds a sha1 method for the "function" class. e.g. sha1(sum)
Thanks for that. Looks like a very clean and straightforward PR. Will merge once Travis completes.
The current implementation of sha1.function(x) takes environment(x) into account. This can lead to differences in sha1 hash when storing and retrieve the function in an S4 object. I am experiencing that problem now.
A solution would be to update sha1.function(x) so that it ignores the environment. Another option is to use an extra argument which allows the user to should whether the environment should be taken into account. In that case sha1() should gain a ... argument.
I'll make a PR after feedback from @eddelbuettel
From the top of my head I like the extra argument idea better.
Another pair of classes that should have a method: formula and environment?
Assuming that formula gets a method, I think that ideally there would be an option to ignore the attached environment (equivalent to sha1.function()).