js-spark-md5

Provide functionality for calculating MD5 hashes of files

Open janko opened this issue 7 years ago • 3 comments

First of all, thanks a lot for creating this very useful library! 🙏

I recently needed to calculate an MD5 hash of a File object, and while I saw the section in the README showing how to do that, I really didn't like how much custom code it involves.

I was wondering, could this functionality maybe be part of this library? In comparison, Ruby has a Digest::MD5 class which supports calculating a hash from a single string, incremental hashing in chunks, and calculating a hash from a file on disk.

require "digest"

Digest::MD5.hexdigest("string")
# or
md5 = Digest::MD5.new
md5.update("chunk1")
md5.update("chunk2")
md5.hexdigest
# or
Digest::MD5.file("/path/to/file").hexdigest

It took me quite a while to find a JavaScript library which simplifies reading a File object in chunks – chunked-file-reader – and it appears to work correctly (I get the same MD5 hash as with the snippet in the README here). So I came up with the following function:

function fileMD5 (file) {
  return new Promise(function (resolve, reject) {
    var spark  = new SparkMD5.ArrayBuffer(),
        reader = new ChunkedFileReader();

    // Feed each chunk into the incremental hasher as it is read.
    reader.subscribe('chunk', function (e) {
      spark.append(e.chunk);
    });

    // Once the whole file has been read, finish the hash.
    reader.subscribe('end', function (e) {
      var rawHash    = spark.end(true); // true => raw binary string instead of hex
      var base64Hash = btoa(rawHash);

      resolve(base64Hash);
    });

    reader.readChunks(file);
  });
}

Since it took me a while to come up with this solution, I was wondering if it made sense to have that built into spark-md5.

janko, Apr 25 '18 17:04

If not, I think it would be nice to show this example in the README, so that people are more willing to copy-paste it into their projects.

janko, Apr 25 '18 17:04

Hello @janko-m. This library was primarily made to be used in browser-like environments. While it works in Node, using the native crypto module will be much faster.

A method that reads files and calculates their hash would have to account for how files are actually read in each environment: browser-like or Node. Because of that, I don't think it makes much sense to have that built in.

I'm willing to improve the README to make it clear how to calculate the hash of a file in both browser and Node environments. Could you make a PR to add examples for both environments? The current example is for a browser environment.

Makes sense?

satazor, Apr 26 '18 22:04

Hey @satazor, thanks for a quick answer.

> A method that reads files and calculates their hash would have to account for how files are actually read in each environment: browser-like or Node. Because of that, I don't think it makes much sense to have that built in.

I was under the impression that this library was already considered "browser-only" because, as you said, Node already has the crypto module (and hasha, which uses it). So I'm not sure I fully understand: if functionality for calculating a hash from a JavaScript File object were added, why would that mean it should also support Node? If that's really the case, then I agree that it wouldn't make much sense.

> I'm willing to improve the README to make it clear how to calculate the hash of a file in both browser and Node environments. Could you make a PR to add examples for both environments? The current example is for a browser environment.

My intention was only to simplify the browser example that's already there, as I only have experience with using spark-md5 in the browser. Great, I'll send the PR then 👍

janko, Apr 27 '18 12:04