cardano-base icon indicating copy to clipboard operation
cardano-base copied to clipboard

Convert from Hash to/from PinnedSizedBytes safely

Open lehins opened this issue 3 years ago • 9 comments
trafficstars

There is a way to implement a total function that does this conversion because we know statically both of their sizes. This came up in #289

lehins avatar Jul 21 '22 21:07 lehins

There seems to be an odd distinction between PackedBytes and PinnedSizedBytes at play here. Is there a reason we can't just use the same type for both?

kozross avatar Jul 21 '22 21:07 kozross

@kozross They are quite different underneath. PackedBytes are backed by Word64s, while PinnedSizedBytes are backed by an actual ByteArray

lehins avatar Jul 21 '22 21:07 lehins

I understand they are implemented quite differently: why can't we unify their representations? I don't really see why they need to be separate in the first place.

kozross avatar Jul 21 '22 22:07 kozross

It is right in the name: one is pinned another one is not :smile:

lehins avatar Jul 21 '22 22:07 lehins

Yeah, I guess that's true, but at the same time, I feel that making both things use ByteArray underneath would make such conversions easier. We could also share APIs in most cases, provided we don't accidentally mix pinned and unpinned things (which these wrappers would prevent).

kozross avatar Jul 21 '22 22:07 kozross

I'm also wondering if there'd ever come a time when we might want to generate hashes via the FFI: we don't need this currently, but we might one day.

kozross avatar Jul 21 '22 22:07 kozross

I feel that making both things use ByteArray underneath would make such conversions easier.

I see what you are trying to say. Putting pinnedness aside, the whole point why PackedBytes exist is because this:

data Hash32 = Hash32 {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64

will use 30% less memory than

newtype Hash32 = Hash32 (PinnedSizedBytes 32)

We store a lot of hashes in memory and it makes a difference.

I'm also wondering if there'd ever come a time when we might want to generate hashes via the FFI: we don't need this currently, but we might one day.

We are already producing hashes over ffi with libsodium, we just immediately convert them to PackedBytes, so they don't contribute to memory fragmentation and consume less memory, as mentioned above.

lehins avatar Jul 21 '22 22:07 lehins

Interesting - I've learned something new today. In that case, the difference in representation definitely makes sense. It would still be nice to have a safe conversion though.

kozross avatar Jul 21 '22 23:07 kozross

It would still be nice to have a safe conversion though.

That's exactly what this ticket is about :wink:

lehins avatar Jul 21 '22 23:07 lehins