sjcl
Incorrect SHA256 calculation?
It seems that some characters are not correctly encoded (or there is some specification missing):
$ wget https://bitwiseshiftleft.github.io/sjcl/sjcl.js
$ nodejs
> var sjcl = require('./sjcl.js')
undefined
> sjcl.codec.hex.fromBits(sjcl.hash.sha256.hash(String.fromCharCode(157)))
'a5b17dce62f930fc353a53ddf965a5edcae3932ea48f251b1ab795f0357ffd14'
while
$ echo -n -e "\0235" | shasum -a 256
9d277175737fb50041e75f641acf94d10df9b9721db8fffe874ab57f8ffb062e -
and using jssha
> SHA256_hash(String.fromCharCode(157))
'9d277175737fb50041e75f641acf94d10df9b9721db8fffe874ab57f8ffb062e'
Am I missing something?
It seems to me that sjcl.hash.sha256.hash actually supports only ASCII characters. I tested with a big loop over thousands of chars, and the results stop matching after char code 127.
To be sure the computed hash is valid, the only way I found, if you want to use that function, is to convert the string to base64 before passing it to the hash method. Something like this:
> var aString = new Buffer(String.fromCharCode(157)).toString('base64')
undefined
> aString
'wp0='
> sjcl.codec.hex.fromBits(sjcl.hash.sha256.hash(aString))
'50a67c737f7cc2a0d131304f6087bd3d54dce7658216bb42c5647c1aba39b68e'
> SHA256_hash(aString)
'50a67c737f7cc2a0d131304f6087bd3d54dce7658216bb42c5647c1aba39b68e'
I don't know if this is a bug, but if it is not, this limitation regarding ASCII should be highlighted in the docs.
A curiosity: if you decode the base64 back to bytes and run shasum on Linux, can you guess the result? ^^
echo -n -e 'wp0=' | base64 -d | shasum -a 256
a5b17dce62f930fc353a53ddf965a5edcae3932ea48f251b1ab795f0357ffd14 -
And one last note. To convert the string back from base64 you can do something like this:
> new Buffer(aString, 'base64').toString('utf8')
Hmm, something's weird here.
> new Buffer(String.fromCharCode(157)).toString('base64')
'wp0='
> btoa(String.fromCharCode(157))
"nQ=="
> atob('nQ==').charCodeAt(0)
157
> atob('wp0=').charCodeAt(0)
194
> atob('wp0=').charCodeAt(1)
157
> sjcl.codec.hex.fromBits(sjcl.codec.base64.toBits('nQ=='))
"9d" // 157
> sjcl.codec.hex.fromBits(sjcl.codec.base64.toBits('wp0='))
"c29d" // 194, 157
> new Buffer(String.fromCharCode(127)).toString('base64')
'fw=='
> sjcl.codec.hex.fromBits(sjcl.codec.base64.toBits('fw=='))
"7f" // 127
> new Buffer(String.fromCharCode(128)).toString('base64')
'woA='
> sjcl.codec.hex.fromBits(sjcl.codec.base64.toBits('woA='))
"c280" // 194, 128
So sjcl's hash is hashing hex value c29d, and node's Buffer produces the base64 encoding of that value as well. But JS's btoa produces the base64 encoding of 9d, or the decimal number 157, and other hashes hash that value as well.
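The two digests can be reproduced with Node's built-in crypto module alone. This is a minimal sketch, assuming a Node.js version that provides `Buffer.from`; the helper name `sha256Hex` is made up for illustration and is not part of sjcl or jssha.

```javascript
const crypto = require('crypto');

// Hypothetical helper: SHA-256 of a byte buffer, hex-encoded.
function sha256Hex(bytes) {
  return crypto.createHash('sha256').update(bytes).digest('hex');
}

const sample = String.fromCharCode(157);

// UTF-8 encodes U+009D as two bytes; latin1 keeps it as a single byte.
const utf8Bytes = Buffer.from(sample, 'utf8');
const latin1Bytes = Buffer.from(sample, 'latin1');

console.log(utf8Bytes.toString('hex'));   // c29d
console.log(latin1Bytes.toString('hex')); // 9d

// Same string, two different byte sequences, hence two different digests.
const utf8Digest = sha256Hex(utf8Bytes);
const latin1Digest = sha256Hex(latin1Bytes);
console.log(utf8Digest);
console.log(latin1Digest);
```

The first digest should match the sjcl output quoted at the top of the thread, and the second the shasum and jssha output.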
Did I understand correctly?
- SHA-256 is definitely working as it should.
- The problem lies in the character-to-bit conversion: sjcl and node's Buffer add a 194 (0xc2) byte, while btoa and the other tools do not.
By the way, according to this table the c2 has to be added: http://www.fileformat.info/info/charset/UTF-8/list.htm
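Where the c2 comes from can be worked out by hand: UTF-8 stores code points from 128 to 2047 as two bytes of the form 110xxxxx 10xxxxxx. The sketch below covers only that two-byte range, and `encodeTwoByteUtf8` is a hypothetical helper written for this illustration, not a general encoder.

```javascript
// Encode a code point in the range U+0080..U+07FF as two UTF-8 bytes.
// Hypothetical helper for illustration; real code should use a full encoder.
function encodeTwoByteUtf8(codePoint) {
  if (codePoint < 0x80 || codePoint > 0x7ff) {
    throw new RangeError('only the two-byte range is handled here');
  }
  const lead = 0xc0 | (codePoint >> 6);   // 110xxxxx: the upper bits
  const cont = 0x80 | (codePoint & 0x3f); // 10xxxxxx: the lower six bits
  return [lead, cont];
}

console.log(encodeTwoByteUtf8(157)); // [ 194, 157 ]
console.log(encodeTwoByteUtf8(128)); // [ 194, 128 ]
```

For every code point from 128 to 191 the lead byte is 194 and the second byte equals the code point itself, which is why the extra byte looks like a prefix in the examples above.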
Yes, it seems that way.
sha256.hash calls utf8String.toBits, which calls encodeURIComponent:
> encodeURIComponent(String.fromCharCode(157))
"%C2%9D"
So idk who's right or wrong, but that seems legitimate.
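To see what that conversion does step by step, here is a small sketch of the `unescape(encodeURIComponent(...))` idiom and its inverse. It assumes an environment where the legacy globals `escape` and `unescape` are still available (they are in Node and browsers, though deprecated); the variable names are only illustrative.

```javascript
const input = String.fromCharCode(157);

// encodeURIComponent percent-encodes the UTF-8 bytes of the string...
const percentEncoded = encodeURIComponent(input);

// ...and unescape turns each %XX back into one character with that code,
// giving a "binary string" with one character per UTF-8 byte.
const utf8ByteString = unescape(percentEncoded);

const byteValues = Array.from(utf8ByteString, (ch) => ch.charCodeAt(0));
console.log(percentEncoded); // %C2%9D
console.log(byteValues);     // [ 194, 157 ]

// The reverse idiom restores the original string.
const roundTripped = decodeURIComponent(escape(utf8ByteString));
console.log(roundTripped === input); // true
```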
If the following line is removed from utf8String.toBits, the output of sha256.hash is @pveber's expected 9d277175737fb50041e75f641acf94d10df9b9721db8fffe874ab57f8ffb062e:
str = unescape(encodeURIComponent(str));
Then the following line would have to be changed in utf8String.fromBits
return decodeURIComponent(escape(out));
to
return out;
But I'm not aware of the ramifications of that.
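An alternative that leaves utf8String untouched might be to hand sjcl a bit array instead of a string, on the assumption that only string input goes through the UTF-8 conversion. The sketch below shows just the byte-to-hex step; `binaryStringToHex` is a hypothetical helper, and the sjcl calls are left in comments because they need the library loaded.

```javascript
// Hypothetical helper: treat each character of a "binary string" as one byte
// (code units 0..255) and render the bytes as hex.
function binaryStringToHex(binaryString) {
  let hex = '';
  for (let i = 0; i < binaryString.length; i++) {
    const code = binaryString.charCodeAt(i);
    if (code > 0xff) {
      throw new RangeError('character at index ' + i + ' does not fit in one byte');
    }
    hex += code.toString(16).padStart(2, '0');
  }
  return hex;
}

const rawHex = binaryStringToHex(String.fromCharCode(157));
console.log(rawHex); // 9d

// With sjcl loaded, the bytes could then be hashed without the UTF-8 step:
//   var bits = sjcl.codec.hex.toBits(rawHex);
//   sjcl.codec.hex.fromBits(sjcl.hash.sha256.hash(bits));
```

This keeps the choice of encoding explicit at the call site rather than changing what sjcl does for every string.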
I created a library for converting back and forth between utf8 because I was encountering these utf-8 encoding issues:
https://github.com/coolaj86/unibabel-js
You could copy and paste the few lines you need for various functions: https://github.com/coolaj86/unibabel-js/blob/master/index.js https://github.com/coolaj86/unibabel-js/blob/master/unibabel.hex.js
// TypedArray <--> UTF8
var uint8Array = Unibabel.strToUtf8Arr(str);
var str = Unibabel.utf8ArrToStr(uint8Array);
// TypedArray <--> Hex
var uint8Array = Unibabel.hexToBuffer(hexstr);
var str = Unibabel.bufferToHex(uint8Array);
// TypedArray <--> Base64
var base64 = Unibabel.arrToBase64(uint8Array)
var uint8Array = Unibabel.base64ToArr(base64)
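For comparison, environments that provide the standard `TextEncoder` and `TextDecoder` globals (recent Node versions and modern browsers) can do the string to UTF-8 round trip without a library. This is only a sketch of that built-in route, not a description of how Unibabel works internally.

```javascript
const encoder = new TextEncoder();        // always encodes to UTF-8
const decoder = new TextDecoder('utf-8');

// String -> UTF-8 bytes
const encoded = encoder.encode(String.fromCharCode(157));
console.log(Array.from(encoded)); // [ 194, 157 ]

// UTF-8 bytes -> string
const decoded = decoder.decode(encoded);
console.log(decoded.charCodeAt(0)); // 157
```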