cbor-js icon indicating copy to clipboard operation
cbor-js copied to clipboard

Fix handling of codepoints >= 0xe000

Open svaarala opened this issue 5 years ago • 4 comments

Codepoints in [U+E000,U+FFFF] were incorrectly handled by the surrogate pair code path.

Example of behavior before the fix:

duk> CBOR.encode('\ue000')
= |64f0908080|
duk> CBOR.decode(CBOR.encode('\ue000'))
= "\ud800\udc00"
duk> CBOR.decode(CBOR.encode('\uffff'))
= "\udbff\udc00"

After the fix:

duk> CBOR.encode('\ue000')
= |63ee8080|
duk> CBOR.decode(CBOR.encode('\ue000'))
= "\ue000"
duk> CBOR.decode(CBOR.encode('\uffff'))
= "\uffff"

Cbor.me gives the following for encoding "\ue000", matching the fixed output:

63        # text(3)
   EE8080 # "\xEE\x80\x80"

svaarala avatar Jun 07 '19 23:06 svaarala

Coverage Status

Coverage remained the same at 99.262% when pulling 0f1ed4ad1fb48bdf6cba529b283bcb74a6c8cb9e on svaarala:fix-cp-above-dfff into 65dc49611107db83aff8308a6b381f4d7933824b on paroga:master.

coveralls avatar Jun 08 '19 00:06 coveralls

This bug is the cause of failure to properly roundtrip emoji with modifiers such as ☝️ and 👆🏿, which are both U+261D with modifier️s U+FE0F or U+1F3FF respectively.

When I apply this change the en/decode roundtrip is successful. Without the change it fails to encode the modifier.

Example code: let value = '_☝️_👆🏿_'; console.log(value, cbor.decode(cbor.encode(value))); Without fix: Screen Shot 2019-08-17 at 11 35 00 AM

With fix: Screen Shot 2019-08-17 at 11 35 32 AM

TrevorFSmith avatar Aug 17 '19 18:08 TrevorFSmith

@paroga It looks like this lib needs some love. Would you be up for handing the reins to someone else either temporarily or permanently? I'd be happy to review the existing PRs and Issues, either in this repo or (if you'd prefer) in a repo under my work org.

TrevorFSmith avatar Aug 17 '19 18:08 TrevorFSmith

I'd also be happy to help in whatever way I can. I'm working with Duktape's native CBOR binding so I could help at least with bug fixing and maybe improve test coverage (Duktape's CBOR binding is based on this library, so many of the testcases in Duktape repo can be used as is).

svaarala avatar Aug 17 '19 18:08 svaarala