toHexString util: try String.fromCharCode api

Open twoeths opened this issue 2 years ago • 5 comments

Is your feature request related to a problem? Please describe.

Refer to https://github.com/achingbrain/uint8arrays/issues/30#issuecomment-1199120924

Given how efficiently protobuf creates a string from a Uint8Array in one go, we could try using the String.fromCharCode() API for our toHexString() function instead of string concatenation (which creates temporary strings that cause the GC to run more frequently).

Describe the solution you'd like

  • Create a function to map a number from 0 to 15 to char code
  • For each byte, extract the high 4 bits and the low 4 bits as two separate uints
  • Combine to an array of char codes
  • Create string from there in one go
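
The first two steps above can be sketched as follows. This is only a minimal illustration, assuming lowercase hex output; the helper names `nibbleToCharCode` and `byteToCharCodes` are hypothetical and do not come from the ssz codebase.

```typescript
// Hypothetical helper (name is an assumption): maps a 4-bit value 0..15
// to the char code of its lowercase hex digit.
function nibbleToCharCode(nibble: number): number {
  // "0".charCodeAt(0) = 48; "a".charCodeAt(0) = 97, so 10..15 need an offset of 87
  return nibble < 10 ? nibble + 48 : nibble + 87;
}

// Hypothetical helper: splits one byte (0..255) into its high and low
// 4-bit halves and returns the char code for each half.
function byteToCharCodes(byte: number): [number, number] {
  return [nibbleToCharCode(byte >> 4), nibbleToCharCode(byte & 0x0f)];
}

const [highCode, lowCode] = byteToCharCodes(0xaf);
console.log(String.fromCharCode(highCode, lowCode)); // prints "af"
```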

twoeths avatar Jul 29 '22 10:07 twoeths

For Node.js, where performance is important, we should just use the Buffer utils, which will probably be faster and more memory efficient.
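
A minimal sketch of what the Buffer-based route could look like, assuming a Node.js runtime (the global `Buffer` is not available in browsers without a polyfill). The function name `toHexStringBuffer` is hypothetical, and the "0x" prefix is assumed so that the output matches the other implementations in this thread.

```typescript
// Hypothetical Node.js-only variant (name is an assumption).
function toHexStringBuffer(bytes: Uint8Array): string {
  // Buffer.from(arrayBuffer, byteOffset, length) wraps the same memory
  // instead of copying it. Passing byteOffset and byteLength matters when
  // `bytes` is a view into a larger ArrayBuffer.
  const buf = Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return "0x" + buf.toString("hex");
}

console.log(toHexStringBuffer(new Uint8Array([0, 1, 171, 255]))); // prints "0x0001abff"
```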

Given how efficiently protobuf creates a string from a Uint8Array in one go

Efficient in terms of CPU time, memory or what specifically?

dapplion avatar Jul 29 '22 12:07 dapplion

Efficient in terms of CPU time, memory or what specifically?

In CPU time, but I guess it could also be better in terms of memory, since no intermediate strings are created (more benchmarks are needed to confirm).

export function toHexString2(bytes: Uint8Array): string {
  const chunks = new Array<number>(bytes.length * 2 + 2);
  // "0".charCodeAt(0)
  chunks[0] = 48;
  // "x".charCodeAt(0)
  chunks[1] = 120;
  for (let i = 0; i < bytes.length; i++) {
    const byte = bytes[i];
    const first = (byte & 0xf0) >> 4;
    const second = byte & 0x0f;

    // "0".charCodeAt(0) = 48
    // "a".charCodeAt(0) = 97 => delta = 87
    chunks[2 + 2 * i] = first < 10 ? first + 48 : first + 87;
    chunks[2 + 2 * i + 1] = second < 10 ? second + 48 : second + 87;
  }
  // return String.fromCharCode.apply(String, chunks);
  return String.fromCharCode(...chunks);
}
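
One caveat with spreading the whole array into a single String.fromCharCode call: JavaScript engines limit how many arguments one call can take, so a very large input can throw a RangeError. The limit is engine-dependent. A possible workaround, sketched below, is to convert the char codes in fixed-size slices. The name `toHexStringChunked` and the `CHUNK_SIZE` value are assumptions chosen for illustration, not something proposed or benchmarked in this thread.

```typescript
// Assumed slice size: how many char codes are passed per String.fromCharCode call.
const CHUNK_SIZE = 8192;

// Hypothetical variant of toHexString2 that avoids one huge argument list.
function toHexStringChunked(bytes: Uint8Array): string {
  const codes = new Array<number>(bytes.length * 2 + 2);
  codes[0] = 48; // "0".charCodeAt(0)
  codes[1] = 120; // "x".charCodeAt(0)
  for (let i = 0; i < bytes.length; i++) {
    const byte = bytes[i];
    const first = byte >> 4;
    const second = byte & 0x0f;
    // Digits 0..9 start at char code 48, letters a..f at 97 (offset 87 from 10)
    codes[2 + 2 * i] = first < 10 ? first + 48 : first + 87;
    codes[2 + 2 * i + 1] = second < 10 ? second + 48 : second + 87;
  }

  let out = "";
  for (let start = 0; start < codes.length; start += CHUNK_SIZE) {
    // Each call receives at most CHUNK_SIZE arguments
    out += String.fromCharCode(...codes.slice(start, start + CHUNK_SIZE));
  }
  return out;
}

console.log(toHexStringChunked(new Uint8Array([0, 1, 171, 255]))); // prints "0x0001abff"
```

The trade-off is that this creates one intermediate string per slice, so it gives back a little of the benefit of building the string in one go. For small inputs that fit in a single slice it behaves like the original.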

Some quick benchmarks:

  toHexString vs String.fromCharCode
    ✓ fromCharCode                                                        320.1892 ops/s    3.123153 ms/op        -      19043 runs   60.0 s
    ✓ toHexString                                                         211.8084 ops/s    4.721247 ms/op   x0.985      12598 runs   60.0 s
    ✓ Buffer.toString hex                                                 342.6784 ops/s    2.918188 ms/op   x0.995      20385 runs   60.0 s

twoeths avatar Jul 30 '22 06:07 twoeths

@tuyennhv For memory efficiency, there's this library that flattens strings. Check it out, it's magic: https://github.com/davidmarkclements/flatstr

dapplion avatar Jul 30 '22 21:07 dapplion

The fromCharCode approach looks good. It is probably the best browser-compatible implementation.

wemeetagain avatar Aug 01 '22 16:08 wemeetagain

@tuyennhv For memory efficiency, there's this library that flattens strings. Check it out, it's magic: https://github.com/davidmarkclements/flatstr

Note that this was recommended by Ben (the libuv maintainer).

dapplion avatar Aug 02 '22 14:08 dapplion