go-multibase

Added Base8 Implementation

Open gowthamgts opened this issue 6 years ago • 14 comments

gowthamgts avatar Nov 23 '18 03:11 gowthamgts

Would creating a base8_test file be the best way to increase coverage?

gowthamgts avatar Nov 25 '18 07:11 gowthamgts

ping @Stebalien

gowthamgts avatar Nov 26 '18 15:11 gowthamgts

I've fixed the recommended changes in code.

> (I'd also kind of like to know why you need octal before we go through the process of defining and implementing it)

I was reading the code and found out base8 was missing and I had some time so. 😕

gowthamgts avatar Nov 27 '18 05:11 gowthamgts

> I was reading the code and found out base8 was missing and I had some time so.

Fair enough. That's also why we have base 2... But we'll still need some spec first.

Stebalien avatar Nov 27 '18 17:11 Stebalien

Totally understandable. Thanks for your time.

gowthamgts avatar Nov 28 '18 06:11 gowthamgts

@Stebalien: Can I implement base2 encoding with reference to the RFCs?

gowthamgts avatar Dec 10 '18 11:12 gowthamgts

Go ahead.

Stebalien avatar Dec 10 '18 23:12 Stebalien

Regarding spec for base 8, I gave this some thought.

For bases whose characters divide a byte evenly, pad to full bytes. These are base 2 (8 chars per byte), base 4 (4 chars per byte), and base 16 (2 chars per byte). It makes sense to treat these as bitstreams and pad out to whole bytes.

But for other power-of-two bases, whose characters don't fit evenly into a byte, use optional padding at the end. Examples are base64 (3 bytes = 4 chars) and base32 (5 bytes = 8 chars).

Base 8 (3 bytes = 8 chars) could fit this style of encoding.
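
The group sizes quoted above follow from the least common multiple of 8 and the number of bits per character. A minimal Go sketch of that arithmetic, where `blockSize` is a hypothetical helper and not part of go-multibase:

```go
package main

import "fmt"

// gcd returns the greatest common divisor of a and b.
func gcd(a, b int) int {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// blockSize is a hypothetical helper (not a go-multibase API). For a
// power-of-two base with bitsPerChar bits per character, it returns the
// smallest group of bytes and characters that cover the same number of bits.
func blockSize(bitsPerChar int) (nBytes, nChars int) {
	lcm := 8 * bitsPerChar / gcd(8, bitsPerChar)
	return lcm / 8, lcm / bitsPerChar
}

func main() {
	bases := []struct {
		name string
		bits int
	}{
		{"base2", 1}, {"base4", 2}, {"base8", 3},
		{"base16", 4}, {"base32", 5}, {"base64", 6},
	}
	for _, b := range bases {
		nBytes, nChars := blockSize(b.bits)
		fmt.Printf("%-6s %d byte(s) = %d char(s)\n", b.name, nBytes, nChars)
	}
}
```

Bases whose group is a single byte (2, 4, 16) never need padding characters; bases 8, 32, and 64 have multi-byte groups, which is where the optional padding comes in.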

Another option, and the one JS currently uses, is to convert the input to a large number and make leading zeroes represent null bytes, similar to base 10 and base 58.

creationix avatar May 01 '19 04:05 creationix

I wrote a base-8 codec that works similarly to base-32 and base-64, where you give it an alphabet and an optional padding character. https://github.com/filecoin-project/lua-filecoin/blob/master/base-8.lua

For example, base8 with '01234567=' as the alphabet, using the same style as base-32 and base-64:

  • Decentralize everything!! -> 72106254331267164344605543227514510062566312711713506415133463441102=====
  • hello world -> 7320625543306744035667562330620==
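
For comparison, here is a rough Go sketch of the padded style described above. It is not the linked Lua codec and not a go-multibase API; `encodeBase8Padded` and `octalAlphabet` are names made up for illustration. It assumes the leading `7` in the two vectors is the multibase prefix for base8, so the sketch produces only the part after it.

```go
package main

import (
	"fmt"
	"strings"
)

// octalAlphabet is the digit set from the example above, without the pad char.
const octalAlphabet = "01234567"

// encodeBase8Padded is a hypothetical helper: it reads the input as a
// bitstream, emits one octal digit per 3 bits (most significant bit first),
// zero-fills the final partial digit, and pads with '=' to a multiple of
// 8 characters, in the same spirit as padded base32 and base64.
func encodeBase8Padded(data []byte) string {
	var sb strings.Builder
	var acc uint32 // bit accumulator
	var bits uint  // number of not-yet-emitted bits held in acc
	for _, b := range data {
		acc = acc<<8 | uint32(b)
		bits += 8
		for bits >= 3 {
			bits -= 3
			sb.WriteByte(octalAlphabet[(acc>>bits)&7])
		}
		acc &= (uint32(1) << bits) - 1 // keep only the leftover bits
	}
	if bits > 0 {
		// Left-align the 1 or 2 leftover bits within a final digit.
		sb.WriteByte(octalAlphabet[(acc<<(3-bits))&7])
	}
	for sb.Len()%8 != 0 {
		sb.WriteByte('=')
	}
	return sb.String()
}

func main() {
	// "7" is assumed to be the multibase prefix for base8.
	fmt.Println("7" + encodeBase8Padded([]byte("hello world")))
}
```

With the prefix prepended, this reproduces both vectors listed above.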

But if I instead use base-x (which is what the JS implementation currently does), the output looks closer to the current test vectors, but with different leading zeroes:

  • Decentralize everything!! -> 71043126154533472162302661513646244031273145344745643206455631620441
  • hello world -> 764145330661571007355734466144
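
The big-number style can be sketched in Go with `math/big`. This is only an approximation of what base-x does, not its actual code; `encodeBase8BigInt` is a made-up name, and the leading `7` is again assumed to be the multibase prefix.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// encodeBase8BigInt is a hypothetical helper: it treats the input as one
// big-endian unsigned integer and prints it in octal. Each leading zero byte
// is written as a single leading '0', as in common base58 encoders.
func encodeBase8BigInt(data []byte) string {
	zeros := 0
	for zeros < len(data) && data[zeros] == 0 {
		zeros++
	}
	prefix := strings.Repeat("0", zeros)
	if zeros == len(data) {
		return prefix // input was empty or all zero bytes
	}
	return prefix + new(big.Int).SetBytes(data[zeros:]).Text(8)
}

func main() {
	// "7" is assumed to be the multibase prefix for base8.
	fmt.Println("7" + encodeBase8BigInt([]byte("hello world")))
}
```

Because the bits are grouped from the least significant end, the digits do not line up with the padded encoding of the same input, and a decoder has to rely on the leading-zero convention to recover the original byte length.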

creationix avatar May 02 '19 23:05 creationix

@creationix I'm fine with either but I'd like to go with what's commonly used in the community. Have you found any other users of base8?

Stebalien avatar Jul 24 '19 21:07 Stebalien

I've not seen any others. I don't know if there is a common encoding for this. Logically, the same style as base-64 and base-32 makes the most sense.

creationix avatar Jul 26 '19 14:07 creationix

So, the real question is, should we even bother? A viable option is to just drop base8.

Stebalien avatar Jul 26 '19 18:07 Stebalien

Personally, I see both base-2 and base-8 as unneeded. Is there any use case where they are the correct solution? Base-16 works everywhere, produces much shorter output, and is easier to encode than either.

My recommendation is either to drop them, reducing the maintenance overhead for implementations, or, if we must keep base-8, to go with the logical encoding I've suggested.

creationix avatar Aug 01 '19 17:08 creationix

https://github.com/multiformats/multibase/issues/59

Stebalien avatar Aug 01 '19 17:08 Stebalien