skyhook icon indicating copy to clipboard operation
skyhook copied to clipboard

Skyhook clobbers binary data

Open shanipribadi opened this issue 3 years ago • 5 comments

Skyhook stores values as Aerospike StringValue, using ByteArray.toString(Charsets.UTF_8)

This causes binary data to be clobbered/corrupted when stored and read again.

I propose to store them as BytesValue instead, so that no conversion is done and data is stored as is.

this aligns with the specification of redis "string" as binary safe https://redis.io/topics/data-types

shanipribadi avatar May 29 '21 10:05 shanipribadi

The only possible binary data would be one that contains the null byte \0. Do you some code reproduction of how anything else is clobbered? That would be helpful.

rbotzer avatar May 30 '21 20:05 rbotzer

@shanipribadi please add an example to reproduce the data corruption.

reugn avatar May 31 '21 12:05 reugn

@reugn Shani reports that the problem is when he persists messagepacked data, it gets identified as a string and mangled a bit. Instead, it should be identified as a blob and saved as one.

@shanipribadi we still need some sample code, please.

rbotzer avatar Jun 09 '21 03:06 rbotzer

Hello, apologies for the very late update

Please take a look at this repo: https://github.com/shanipribadi/skyhookcomp it has the necessary test case to reproduce the issue.

it's implemented using the same library we're using in production (go-redis + msgpack).

You can try inserting \x83 or any other invalid utf-8 bytes, and you will see that skyhook+aerospike clobbers it, and converts it to \xef\xbf\xbd when returning the value again.

shanipribadi avatar Oct 13 '21 13:10 shanipribadi

Thanks @shanipribadi. The issue has been resolved. Check out the main branch to have the fix.

reugn avatar Oct 21 '21 15:10 reugn