node-mysql2 icon indicating copy to clipboard operation
node-mysql2 copied to clipboard

fix(cache): improve cache key serialization

Open wellwelwel opened this issue 1 year ago • 5 comments

This PR improves the serialization of LRU keys when generating the cache 🚀

It also implements additional sanitization to improve the security of cached keys 👩🏻‍🔧


Deciding between possible alternatives

Basic local performance checking for 15,000 large unique keys generated by using:

  • SHA256: ~221ms
  • SHA1: ~209ms
  • Base64: ~60ms
  • Regex: ~54ms
  • Loop each char: ~36ms
  • JSON.stringify: https://github.com/sidorares/node-mysql2/pull/2424#issuecomment-1919672175 (⭐️)
  • No sanitize: ~20ms

Outdated

Tests for this fix don't include MySQL Server. Since it's simple, I just added a deep validation for each type that keys from cache can receive 🔬

Why loop each char instead of an encoding like Base64?

~Although it's much easier to implement the solution using Base64, I don't think this facility is worth ~3x less performance while there is another more effective way of achieving the same result.


Getting into more detail, I have also distinguished between what is actually typeof undefined and what is an undefined parameter.

For example, a parameter undefined will be _, while a typeof undefined will be kept as it is.
This not only emphasizes the uniquenesss, but will also make the key smaller.


Please feel free to discuss on it 🙋🏻‍♂️

wellwelwel avatar Jan 31 '24 16:01 wellwelwel

I did a benchmark of JSON.stringify(arrayOfKeys) as a serialiser and looks like it's about 50% quicker:

https://gist.github.com/sidorares/e735b99b68c33871e74f789e1fa6f5e3

sidorares avatar Jan 31 '24 18:01 sidorares

Thanks @sidorares, you're right.

I've kept some basic sanitization just for the boolean options:

Using this way makes it readable and easier to maintain 🧑🏻‍🔧

function keyFromFields(type, fields, options, config) {
  const res = [
    type,
    typeof options.nestTables,
    options.nestTables,
    Boolean(options.rowsAsArray),
    Boolean(options.supportBigNumbers || config.supportBigNumbers),
    Boolean(options.bigNumberStrings || config.bigNumberStrings),
    typeof options.typeCast,
    options.timezone || config.timezone,
    Boolean(options.decimalNumbers),
    options.dateStrings,
  ];

  for (let i = 0; i < fields.length; ++i) {
    const field = fields[i];

    res.push([
      field.name,
      field.columnType,
      field.length,
      field.schema,
      field.table,
      field.flags,
      field.characterSet,
    ]);
  }

  return JSON.stringify(res, null, 0);
}

Specially because of the , in the array serialization, it's a safer approach too, replacing the current / and :.

I also tried using push(...[]) and Object.assign, but this approach was simpler and more effective.

wellwelwel avatar Jan 31 '24 23:01 wellwelwel

@sidorares, do you think it's a good idea to export the keyFromFields (internally only) to add an unit test?

wellwelwel avatar Feb 01 '24 00:02 wellwelwel

@sidorares, do you think it's a good idea to export the keyFromFields (internally only) to add an unit test?

yeah, probably a good idea

sidorares avatar Feb 01 '24 05:02 sidorares

@sidorares

I'll put the comments above each test here:

  • test1: Invalid (checking for invalid options)
  • test2: Invalid, except for config (global overwriting)
  • test3: Invalid, except for options
  • test4: A test based on results of SELECT * FROM test WHERE value = ?
  • test5: Same from test4, but with invalid booleans need to reach out the same key
  • test6: Forcing delimiters on strings fields
  • test6: Checking for quotes escape
  • test7: Valid: options with true on booleans
  • test8: Expects the same result from test7, but using valid values instead of true on booleans fields
  • test9: Invalid: checking function parser in wrong fields
    • Expect to be null for string
    • Expect to be true for booleans
    • Expect to be "function" for typeof

Additional

  • Try to parse all keys to ensure that the JSON.stringify is generating a valid result.
  • Testing twice all existent tests to ensure that all keys are reused when they repeat.
  • Overwriting the JSON.stringify and test previous results with new generated keys from custom JSON.stringify.

My goal was:

  • Check the delimiter (,)
  • Ensure that the keys will be unique
  • Ensure that same expected results have the same key when they are generated again
  • Add kinds of invalid tests for when a user does something that could break the serialization

Feel free to suggest more cases.

wellwelwel avatar Feb 01 '24 10:02 wellwelwel

@sidorares, I will open a PR just to rebase the tests commits.

The test name needs to be: test/unit/parsers/cache-key-serialization.tes.cjs

wellwelwel avatar Mar 26 '24 05:03 wellwelwel

I edited the message in the "squash & merge" dialog but I think I edited by mistake "long log" and release-please picks short message. Could you make sure the release contains correct attribution before it's created @wellwelwel? I guess we can just edit body of https://github.com/sidorares/node-mysql2/pull/2529

"fix(cache): improve cache key formation. Fixes a potential parser cache poisoning attack vulnerability reported by Vsevolod Kokorin (Slonser) of Solidlab" - https://github.com/sidorares/node-mysql2/commit/0d54b0ca6498c823098426038162ef10df02c818

sidorares avatar Mar 26 '24 06:03 sidorares

I'll also check the change logs after merging this.

wellwelwel avatar Mar 26 '24 06:03 wellwelwel