
How to compute the Sum512 of a data stream of unknown length

Open • gianalbertochini opened this issue 1 year ago • 1 comment

Hello,

I’m curious to know if it’s possible to compute the hash of a data stream without knowing its length in advance, and to do this without storing the entire data in RAM. Ideally, the hash should be updated incrementally.

I’m a beginner, so this may be a simple question. As I understand it, computing the hash requires dividing the data into chunks of 1024 bytes.

Put more simply, I want to write a Hash class with a method void HashBytes(byte[]). This method would take an arbitrary number of bytes as input and keep a “partial hash” in memory, which is updated incrementally every time N bytes arrive from the stream.

Another method, byte[64] Close(void), would return the 512-bit hash as an array of 64 bytes, representing the entire received stream.

For example:

hash = New Hash()
hash.HashBytes(byte[] "This is the first array of byte.")
hash.HashBytes(byte[] " <A VERY LONG STRING>.") // This could be several MiB of data
hash.HashBytes(byte[] "") // Nothing is added
hash.HashBytes(byte[] " This is the second")
hash.HashBytes(byte[] ".") // Just 1 byte is added
byte[64] result0 = hash.Close()

byte[64] result1 = Sum512(byte[] "This is the first array of byte. <A VERY LONG STRING>. This is the second.")

result0 should be equal to result1.

Is this possible, and if so, how can I do it?

Many thanks

gianalbertochini, Jun 02 '24 16:06

Sure, here's how to do that in Go:

h := blake3.New(64, nil) // 512-bit output, no key
h.Write([]byte("This is the first array of byte."))
h.Write([]byte("<A VERY LONG STRING>"))
result := h.Sum(nil)

In Go, we typically use the io.Reader and io.Writer interfaces when working with lots of data. For example, if your data is stored in a file, you could do this:

f, err := os.Open("path/to/file")
if err != nil {
    log.Fatal(err)
}
h := blake3.New(64, nil)
h.Write([]byte("This is the first array of byte."))
if _, err := io.Copy(h, f); err != nil { // stream the file contents into the hash
    log.Fatal(err)
}
result := h.Sum(nil)

This works because io.Copy streams data from an io.Reader (in this case, f, the file) to an io.Writer (in this case, h, the hash).

Hope this helps!

lukechampine, Jun 02 '24 20:06