roc icon indicating copy to clipboard operation
roc copied to clipboard

Builtin SHA256 hashing

Open MatthewJohnHeath opened this issue 1 year ago • 19 comments

Expose the functionality of Zig's std.crypto.hash.sha2.Sha256 through a pure-functional interface to Roc.

MatthewJohnHeath avatar Aug 09 '24 17:08 MatthewJohnHeath

@MatthewJohnHeath I tried to push to your branch but I got permission denied. Made a PR in case that helps.

lukewilliamboswell avatar Aug 14 '24 01:08 lukewilliamboswell

Hi Luke, Sorry I'm slow to respond. I've been travelling. There are definitely some problems on the branch; it builds but then in the REPL the functions seem to be missing (although triggering a different error to actually non-existent function names). I created the initial PR mostly in order to get feedback on what might be wring (I was also expecting to be able to get more time online these couple of weeks). I think I am going to need some help to get it working.

On Wed, 14 Aug 2024 at 02:26, Luke Boswell @.***> wrote:

@MatthewJohnHeath https://github.com/MatthewJohnHeath I tried to push to your branch but I got permission denied. Made a PR in case that helps.

— Reply to this email directly, view it on GitHub https://github.com/roc-lang/roc/pull/6977#issuecomment-2287656759, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEZKVQQ7LL7GIFEDIJSQHBTZRKW3PAVCNFSM6AAAAABMI4AGLKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEOBXGY2TMNZVHE . You are receiving this because you were mentioned.Message ID: @.***>

-- Matthew Heath http://mattheath.wordpress.com

MatthewJohnHeath avatar Aug 20 '24 09:08 MatthewJohnHeath

The shape of this PR is a good start, but I'm seeing a potential issue if we add other cryptographic code to this module, which would be quite unsurprising. You have emptySha256, addBytes, and digest. The first is named based on the hash, but the second and third are nominally agnostic to the hash, but only work for SHA-256. Could you rename them to addBytesSha256 or something similar?

smores56 avatar Aug 30 '24 19:08 smores56

Otherwise, I'm attempting to get things running

smores56 avatar Aug 30 '24 19:08 smores56

Okay, I'm almost done getting this working, just need to coordinate the Zig and Roc builtin types.

smores56 avatar Aug 30 '24 20:08 smores56

The REPL worked for me! We'll still need some tests, but it should be good to go, above comments notwithstanding.

smores56 avatar Aug 30 '24 20:08 smores56

Also, there's no code currently to extract state from the new Digest256 opaque type. Could you add a function or something to do that? Maybe Crypt.digest256ToBytes : Digest256 -> List U8?

smores56 avatar Aug 30 '24 21:08 smores56

Also, there's no code currently to extract state from the new Digest256 opaque type. Could you add a function or something to do that? Maybe Crypt.digest256ToBytes : Digest256 -> List U8?

I was thinking of (only) having the type implement Eq. Can you think of use cases where the individual bytes would be needed?

MatthewJohnHeath avatar Aug 31 '24 15:08 MatthewJohnHeath

I was thinking of (only) having the type implement Eq. Can you think of use cases where the individual bytes would be needed?

If I want to store the hash for future comparison. Not that using SHA-256 would be recommended over Argon2, but the idea holds. There are a few algorithms out there that use SHA-256 digest bytes for computation (Wikipedia), at least UUID v5 uses SHA-1 in that way.

smores56 avatar Sep 02 '24 09:09 smores56

Tests are currently failing. @MatthewJohnHeath do you have time to implement the above fixes any time soon? If not, I can help, but it's your PR, so I'll defer to you.

smores56 avatar Sep 09 '24 17:09 smores56

Tests are currently failing. @MatthewJohnHeath do you have time to implement the above fixes any time soon? If not, I can help, but it's your PR, so I'll defer to you.

@smores56 It's not really clear to me why the build is hanging (which seems to be what is happening). I will try a few things, but might need to ask for help

MatthewJohnHeath avatar Sep 09 '24 18:09 MatthewJohnHeath

@smores56 It's building now. Hopefully I can do those last 2 renamings of functions you asked for tomorrow.

MatthewJohnHeath avatar Sep 09 '24 21:09 MatthewJohnHeath

If this gets through the tests, would it be to merge and then do docs and tests? Or do those need to be there first?

MatthewJohnHeath avatar Sep 11 '24 17:09 MatthewJohnHeath

It's always preferable to do at least testing, but ideally also docs, in the same PR. We'd usually avoid them if they're going to make the PR too big, or if they're blocking us merging an important change/bug fix. Since it's neither of those, I'd prefer if we can get them in this change. I'm happy to help with either the tests or docs!

smores56 avatar Sep 11 '24 18:09 smores56

This all looks great! A heads up, we're considering how we should name the module Crypt in Zulip, but that would if anything change things after this PR.

smores56 avatar Sep 13 '24 16:09 smores56

The test that is blocking merge is unrelated, I'll look into unblocking this PR on Sunday, unless someone else gets the chance to fix this before then.

smores56 avatar Sep 13 '24 17:09 smores56

I'm a little concerned nothing seemed to fail as a result of fromHexString having been wrong. Is there a known reason that the "expects" in Crypt.roc wouldn't be checked?

MatthewJohnHeath avatar Sep 16 '24 16:09 MatthewJohnHeath

So I made progress on running things. Tests are running for the builtins, and there is some LLVM issue I'll have to investigate tomorrow:

smores@smortress:~/dev/roc$ cargo run --release --bin roc -- test crates/compiler/builtins/roc/Crypt.roc
    Finished release [optimized] target(s) in 0.16s
     Running `target/release/roc test crates/compiler/builtins/roc/Crypt.roc`
Call parameter type does not match function signature!
%list.RocList %"#arg2"
 ptr  %call_builtin = call i64 @roc_builtins.crypt.sha256AddBytes(i64 %"#arg1", %list.RocList nocapture nonnull readonly byval(%list.RocList) %"#arg2"), !dbg !1635


Function "Crypt_sha256AddBytes_ade2dc9e385e74c61c8d36210907131d7823a7fe8d06c7bd978c1f46a9d3830" failed LLVM verification in NON-OPTIMIZED build. Its content was:

define internal fastcc i64 @Crypt_sha256AddBytes_ade2dc9e385e74c61c8d36210907131d7823a7fe8d06c7bd978c1f46a9d3830(i64 %"#arg1", %list.RocList %"#arg2") !dbg !1634 {
entry:
  %call_builtin = call i64 @roc_builtins.crypt.sha256AddBytes(i64 %"#arg1", %list.RocList nocapture nonnull readonly byval(%list.RocList) %"#arg2"), !dbg !1635
  ret i64 %call_builtin, !dbg !1635
}
thread 'main' panicked at crates/compiler/gen_llvm/src/llvm/build.rs:5810:21:
😱 LLVM errors when defining function "Crypt_sha256AddBytes_ade2dc9e385e74c61c8d36210907131d7823a7fe8d06c7bd978c1f46a9d3830"; I wrote the full LLVM IR to "/tmp/nix-shell.7F740v/test.ll"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

smores56 avatar Oct 08 '24 08:10 smores56

I've been distracted, but I'll be looking at this again today. I might have to ask for help from @bhansconnect since I'm not much of an LLVM guy, which seems to be the last issue here. Needless to say, @MatthewJohnHeath there's nothing left that you need to do on this PR. If you have any changes you want to make, feel free, but otherwise I'm planning on merging as soon as I fix the LLVM issue.

smores56 avatar Oct 21 '24 21:10 smores56

I merged in main, renamed to Crypto, fixed the LLVM issues (I think -- at least it's compiling now). Running those Crypto.roc tests I get the following.

$ cargo run --release --bin roc -- test crates/compiler/builtins/roc/Crypto.roc
    Finished release [optimized] target(s) in 0.66s
     Running `target/release/roc test crates/compiler/builtins/roc/Crypto.roc`
── EXPECT FAILED in crates/compiler/builtins/roc/Crypto.roc ────────────────────

This expectation failed:

63│>  expect
64│>      data : List U8
65│>      data = []
66│>      want = digestBytesOfEmpty
67│>      got = data |> hashSha256 |> digest256ToBytes
68│>      want == got

When it failed, these variables had these values:

data : List U8
data = []

want : List U8
want = [227, 176, 196, 66, 152, 252, 28, 20, 154, 251, 244, 200, 153, 111, 185, 36, 39, 174, 65, 228, 100, 155, 147, 76, 164, 149, 153, 27, 120, 82, 184, 85]

got : List U8
got = [36, 185, 111, 153, 200, 244, 251, 154, 20, 28, 252, 152, 66, 196, 176, 227, 85, 184, 82, 120, 27, 153, 149, 164, 76, 147, 155, 100, 228, 65, 174, 39]


── EXPECT FAILED in crates/compiler/builtins/roc/Crypto.roc ────────────────────

This expectation failed:

70│>  expect
71│>      data = ['a', 'b', 'c']
72│>      want = digestBytesOfAbc
73│>      got = data |> hashSha256 |> digest256ToBytes
74│>      want == got

When it failed, these variables had these values:

data : List U8
data = [97, 98, 99]

want : List U8
want = [186, 120, 22, 191, 143, 1, 207, 234, 65, 65, 64, 222, 93, 174, 34, 35, 176, 3, 97, 163, 150, 23, 122, 156, 180, 16, 255, 97, 242, 0, 21, 173]

got : List U8
got = [35, 34, 174, 93, 222, 64, 65, 65, 234, 207, 1, 143, 191, 22, 120, 186, 173, 21, 0, 242, 97, 255, 16, 180, 156, 122, 23, 150, 163, 97, 3, 176]


── EXPECT FAILED in crates/compiler/builtins/roc/Crypto.roc ────────────────────

This expectation failed:

76│>  expect
77│>      data = Str.toUtf8 "abcdefghbcdefghicdefghijdefghijkefghijklfghijklmghijklmnhijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu"
78│>      want = digestBytesOfLong
79│>      got = data |> hashSha256 |> digest256ToBytes
80│>      want == got

When it failed, these variables had these values:

data : List U8
data = [97, 98, 99, 100, 101, 102, 103, 104, 98, 99, 100, 101, 102, 103, 104, 105, 99, 100, 101, 102, 103, 104, 105, 106, 100, 101, 102, 103, 104, 105, 106, 107, 101, 102, 103, 104, 105, 106, 107, 108, 102, 103, 104, 105, 106, 107, 108, 109, 103, 104, 105, 106, 107, 108, 109, 110, 104, 105, 106, 107, 108, 109, 110, 111, 105, 106, 107, 108, 109, 110, 111, 112, 106, 107, 108, 109, 110, 111, 112, 113, 107, 108, 109, 110, 111, 112, 113, 114, 108, 109, 110, 111, 112, 113, 114, 115, 109, 110, 111, 112, 113, 114, 115, 116, 110, 111, 112, 113, 114, 115, 116, 117]

want : List U8
want = [207, 91, 22, 167, 120, 175, 131, 128, 3, 108, 229, 158, 123, 4, 146, 55, 11, 36, 155, 17, 232, 240, 122, 81, 175, 172, 69, 3, 122, 254, 233, 209]

got : List U8
got = [55, 146, 4, 123, 158, 229, 108, 3, 128, 131, 175, 120, 167, 22, 91, 207, 209, 233, 254, 122, 3, 69, 172, 175, 81, 122, 240, 232, 17, 155, 36, 11]


── EXPECT FAILED in crates/compiler/builtins/roc/Crypto.roc ────────────────────

This expectation failed:

82│>  expect
83│>      want = digestBytesOfEmpty
84│>      got = emptySha256 {} |> sha256Digest |> digest256ToBytes
85│>      want == got

When it failed, these variables had these values:

want : List U8
want = [227, 176, 196, 66, 152, 252, 28, 20, 154, 251, 244, 200, 153, 111, 185, 36, 39, 174, 65, 228, 100, 155, 147, 76, 164, 149, 153, 27, 120, 82, 184, 85]

got : List U8
got = [36, 185, 111, 153, 200, 244, 251, 154, 20, 28, 252, 152, 66, 196, 176, 227, 85, 184, 82, 120, 27, 153, 149, 164, 76, 147, 155, 100, 228, 65, 174, 39]


── EXPECT FAILED in crates/compiler/builtins/roc/Crypto.roc ────────────────────

This expectation failed:

87│>  expect
88│>      data = ['a', 'b', 'c']
89│>      want = digestBytesOfAbc
90│>      got =
91│>          emptySha256 {}
92│>          |> sha256AddBytes data
93│>          |> sha256Digest
94│>          |> digest256ToBytes
95│>      want == got

When it failed, these variables had these values:

data : List U8
data = [97, 98, 99]

want : List U8
want = [186, 120, 22, 191, 143, 1, 207, 234, 65, 65, 64, 222, 93, 174, 34, 35, 176, 3, 97, 163, 150, 23, 122, 156, 180, 16, 255, 97, 242, 0, 21, 173]

got : List U8
got = [35, 34, 174, 93, 222, 64, 65, 65, 234, 207, 1, 143, 191, 22, 120, 186, 173, 21, 0, 242, 97, 255, 16, 180, 156, 122, 23, 150, 163, 97, 3, 176]


── EXPECT FAILED in crates/compiler/builtins/roc/Crypto.roc ────────────────────

This expectation failed:

 97│>  expect
 98│>      want = digestBytesOfAbc
 99│>      got =
100│>          emptySha256 {}
101│>          |> sha256AddBytes ['a']
102│>          |> sha256AddBytes ['b']
103│>          |> sha256AddBytes ['c']
104│>          |> sha256Digest
105│>          |> digest256ToBytes
106│>      want == got

When it failed, these variables had these values:

want : List U8
want = [186, 120, 22, 191, 143, 1, 207, 234, 65, 65, 64, 222, 93, 174, 34, 35, 176, 3, 97, 163, 150, 23, 122, 156, 180, 16, 255, 97, 242, 0, 21, 173]

got : List U8
got = [35, 34, 174, 93, 222, 64, 65, 65, 234, 207, 1, 143, 191, 22, 120, 186, 173, 21, 0, 242, 97, 255, 16, 180, 156, 122, 23, 150, 163, 97, 3, 176]


6 failed and 39 passed in 568 ms.

I had a brief crack at the suggestion from @bhansconnect to change to [7]u128 align(16) but my zig foo isn't up to scratch.

lukewilliamboswell avatar Nov 20 '24 00:11 lukewilliamboswell

I hadn't noticed this was in progress again. I will see if I can work out what is going on

MatthewJohnHeath avatar Nov 24 '24 15:11 MatthewJohnHeath

@lukewilliamboswell I meant to rebase all of this to the current upstream and then force push, but I messed it up and lost your 3 recent commits. Sorry. Could you please push them again please?

MatthewJohnHeath avatar Nov 24 '24 18:11 MatthewJohnHeath

@MatthewJohnHeath -- thank you for your patience with this one. 😃

I'm really looking forward to having this in the builtins, and laying the foundation for more cryptographic primitives.

lukewilliamboswell avatar Nov 29 '24 21:11 lukewilliamboswell

The bug was it converting u128 to bytes as big-endian, which was (almost always) wrong Now it converts them as little-endian and the tests pass (on my computer, and presumably almost all others). But it probably should also support builds for big-endian machines @lukewilliamboswell would trying to merge it like this and raising a separate ticket to fix it for big-endian architectures be reasonable?

MatthewJohnHeath avatar Dec 01 '24 16:12 MatthewJohnHeath

I'm having some issues with memory running tests locally. There seems to be a failure in crates/repl_test/src/tests.rsand I can't see how it would be related

MatthewJohnHeath avatar Dec 09 '24 11:12 MatthewJohnHeath

I am having some trouble running test locally (insufficient disc space). There seems to be a failure in interpolation_with_nested_strings. I can't see how that could be related.

MatthewJohnHeath avatar Dec 09 '24 11:12 MatthewJohnHeath

I am having some trouble running test locally (insufficient disc space). There seems to be a failure in interpolation_with_nested_strings. I can't see how that could be related.

Oh I think it must be related. I didn't realise how early in the process that test was. I will try to unpick it sometime this week

MatthewJohnHeath avatar Dec 09 '24 13:12 MatthewJohnHeath

Going to close this PR given the Zig rewrite

isaacvando avatar Feb 14 '25 22:02 isaacvando