swift-snapshot-testing Feature/hashable diff

Add snapshot comparison for hashable elements based on their hash values.

The idea behind this is to be able to check a UIView screenshot without having to store its heavy pixel array data, and then to use @applidium ADLayoutTest to generate a large number of screenshots for a given view.

Improvement ideas:

Add a pre-hash attachment to increase visibility on failures (mostly for UIImage)
Add snapshotting on Hashable sequences to generate a single hash for a batch of values (i.e. for a view configured with several view models)

May 03 '19 15:05 laviallb

Hi @laviallb! Thanks for taking the time to PR! We have a few questions because we want to better understand what you're solving for.

"The idea behind this is to be able to check a UIView screenshot without having to store its heavy pixel array data."

The hashValue of this array isn't guaranteed to be unique, right? There could be 2 large data arrays with different pixel data and the same hash, right? Is the idea that this would be rare?

I'm also wondering how you go about troubleshooting a failure when you have one. Is there any way to easily get to the image diff?

Have you considered using recursiveDescription as a way of saving space? Or even the new visualRecursiveDescription method that got added with iOS 13? https://github.com/pointfreeco/swift-snapshot-testing/issues/238

Jun 11 '19 21:06 stephencelis

Hi,

I think this is possible to assume that Swift hash is strong enough to make the collision probability small enough to consider the hash value unique.

Indeed, I tried to handle failures better by providing a snapshot of the failing view, but I haven't been able to do that without substantially modifying your API. If you think it is important, I can add it to the PR, and I would be happy to have any advice on how it could be done. (This is the first improvement idea.)

Yes, any textual description could be a way of reducing the snapshot data. But the final idea is to reduce a sequence of hashes to a single hash: we take several snapshots of a given view with different randomly generated view models, hash them, reduce them to a single hash, and then store just that one hash for all of them. Admittedly, this makes it harder to understand why a test fails, since you have to run the test again to generate all the images in the sequence and find the failing one. In exchange, it allows testing a view in many configurations without storing a large number of snapshots in a git repository.
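A minimal sketch of the "reduce to a single hash" step, assuming each snapshot is available as some Hashable value. The name `combinedHash` and the byte arrays standing in for pixel data are hypothetical and not taken from the PR. It relies on `Swift.Hasher`, so the result is only comparable within one execution of the program.

```swift
/// Folds a sequence of hashable values into a single hash, in iteration order.
/// Hypothetical helper; not part of swift-snapshot-testing.
func combinedHash<S: Sequence>(_ values: S) -> Int where S.Element: Hashable {
    var hasher = Hasher()
    for value in values {
        hasher.combine(value)
    }
    return hasher.finalize()
}

// Stand-ins for the pixel data of three snapshots of the same view.
let snapshots: [[UInt8]] = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

let first = combinedHash(snapshots)
let second = combinedHash(snapshots)
print(first == second) // true within a single run
```

Because every value is fed into one hasher, a mismatch only says that at least one snapshot changed, not which one.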

thanks

Jun 12 '19 06:06 laviallb

@laviallb Sorry, just circling around to this now. Brandon and I will try to chat about it tomorrow. It's definitely an interesting idea and we're starting to see the churn in this and other repos.

I'm wondering, did you ever consider git-lfs for snapshots as an alternative strategy?

May 06 '20 18:05 stephencelis

"The hashValue of this array isn't guaranteed to be unique, right? There could be 2 large data arrays with different pixel data and the same hash, right? Is the idea that this would be rare?" — @stephencelis

"I think this is possible to assume that Swift hash is strong enough to make the collision probability small enough to consider the hash value unique." — @laviallb

This should only be an issue when using hashValue.

If the type implemented it badly, then you get many collisions. Swift 4.2's Hasher should be practically collision-free, thanks to guaranteed dispersion quality.

There's more to look out for than just hash collisions though:

"Important: hashValue is deprecated as a Hashable requirement. To conform to Hashable, implement the hash(into:) requirement instead." — Hashable Documentation

As such, instead of value.hashValue one would need to do this:
```
var hasher = Hasher()
value.hash(into: &hasher)
let hashValue = hasher.finalize()
```
"Important: Hash values are not guaranteed to be equal across different executions of your program. Do not save hash values to use in a future execution." — Hashable Documentation

"[…] However, Hasher may generate entirely different hash values in other executions, even if it is fed the exact same byte sequence. This randomization is a critical feature, as it makes it much harder for potential attackers to predict hash values. Hashable has always been documented to explicitly allow such nondeterminism." — SE-0206 (Hashable Enhancements)

As a general rule one should NEVER persist hash values. If one really, really, really positively needs to persist hash values, then SWIFT_DETERMINISTIC_HASHING provides a suitable escape hatch:

"In certain controlled environments, such as while running particular tests, it may be helpful to selectively disable hash seed randomization, so that hash values and the order of elements in set and dictionary values remain consistent across executions. You can disable hash seed randomization by defining the environment variable SWIFT_DETERMINISTIC_HASHING with the value of 1. The Swift runtime looks at this variable during process startup and, if it's defined, replaces the random seed with a constant value. (SE-0206) (35052153)" Swift 4.2 Release Notes
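As a sketch of how a test helper might guard against comparing persisted hashes in a randomized run, the check below reads the variable through Foundation's `ProcessInfo`. The function name `isDeterministicHashingEnabled` is hypothetical, and treating only the value `1` as enabled is an assumption based on the wording of the release note above.

```swift
import Foundation

/// Returns true when the process was started with SWIFT_DETERMINISTIC_HASHING=1.
/// Hypothetical helper; the environment is injectable to keep it testable.
func isDeterministicHashingEnabled(
    in environment: [String: String] = ProcessInfo.processInfo.environment
) -> Bool {
    // Assumption: only the documented value "1" counts as enabled.
    return environment["SWIFT_DETERMINISTIC_HASHING"] == "1"
}

if isDeterministicHashingEnabled() {
    print("Hash seed is fixed; stored hashes can be compared.")
} else {
    print("Hash seed is randomized; skip comparisons against stored hashes.")
}
```

The variable is read by the runtime at process startup, so setting it from inside an already running test has no effect on hashing.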

I would strongly advise against using (or recommending to use) SWIFT_DETERMINISTIC_HASHING by default though, as it makes Set/Dictionary order deterministic, which may lure users into writing invalid (yet possibly succeeding) order-dependent tests.
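To illustrate the kind of order-dependent test this refers to, here is a small sketch with made-up values. Comparing `Array(tags)` against a literal would depend on the hash seed, whereas imposing an explicit order does not.

```swift
let tags: Set<String> = ["red", "green", "blue"]

// Fragile: the element order of Array(tags) is an implementation detail and
// may differ between executions unless hashing is made deterministic.
let unordered = Array(tags)

// Robust: sort before comparing, so the hash seed no longer matters.
let sortedTags = tags.sorted()
print(sortedTags == ["blue", "green", "red"]) // true regardless of hash seed
print(unordered.count) // 3
```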

A superior alternative to SWIFT_DETERMINISTIC_HASHING would be to implement a reasonably simple hasher without seed randomization (such as FNV-1) and do …

```
var hasher = CustomHasher()
value.hash(into: &hasher)
let hashValue = hasher.finalize()
```

Unfortunately, Hashable is tightly coupled to Swift.Hasher and as such doesn't support custom hashers used as shown above.
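Since a custom hasher cannot be plugged into `hash(into:)`, one workaround is to hash the raw bytes directly. The following is a minimal sketch of a 64-bit FNV-1 hasher over bytes, using the published FNV offset basis and prime. The type name `FNV1Hasher` and its `combine(bytes:)` method are hypothetical, and the caller is assumed to already have a stable byte representation of the value (for example, encoded image data).

```swift
/// 64-bit FNV-1 over raw bytes. It has no random seed, so the same bytes
/// produce the same digest in every execution. Not cryptographically secure.
struct FNV1Hasher {
    private var state: UInt64 = 0xcbf2_9ce4_8422_2325 // FNV offset basis
    private static let prime: UInt64 = 0x0000_0100_0000_01b3 // FNV prime

    mutating func combine<S: Sequence>(bytes: S) where S.Element == UInt8 {
        for byte in bytes {
            // FNV-1 multiplies first, then XORs in the next byte.
            state = state &* FNV1Hasher.prime
            state ^= UInt64(byte)
        }
    }

    func finalize() -> UInt64 {
        return state
    }
}

var fnv = FNV1Hasher()
fnv.combine(bytes: Array("hello".utf8))
let digest = fnv.finalize()
print(String(digest, radix: 16))
```

A 64-bit non-cryptographic hash trades collision resistance for simplicity, so this only fits cases where an occasional undetected change is acceptable.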

(Disclaimer: I'm co-author of "Hashable Enhancements" (SE-0206))

May 06 '20 23:05 regexident
