
Could we get support for strings with numbers

Open dkrutsko opened this issue 2 years ago • 3 comments

We are looking to use this library for storing a large collection of domains, some of which have numbers (e.g. 101domain.com). Would it be possible to add support for numbers in this library?

We found a workaround by encoding the numbers with special characters but it would be nicer to not have to do that. Furthermore, if the values are just an array, the unpack function should probably just have the option of returning an Array or Set, instead of an object.

import { unpack } from 'efrt';

const chars =
{
	')': '0',
	'~': '1',
	'@': '2',
	'#': '3',
	'$': '4',
	'%': '5',
	'^': '6',
	'&': '7',
	'*': '8',
	'(': '9',
};

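// `packed` is the efrt string produced from the placeholder-encoded domains
const encoded = Object.keys (unpack (packed));

// Swap each placeholder back to its digit (no escaping needed inside [...])
const decode = (s) => s.replace (/[)~@#$%^&*(]/g, (m) => chars[m]);
const set = new Set (encoded.map (decode));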

console.log (set.has ('101domain.com'));

dkrutsko • Apr 02 '23 07:04

oh hey cool - sounds like a neat project. Yeah, I agree. The way it quietly fails on some characters right now is bad. Is it possible to have a pipe character or a semicolon in a domain? I didn't think this through enough when making it. Be careful with your example that you are not converting any intended ) characters to 0s, etc.

I also agree the array output thing is awkward.

hmm. Sounds like we're making an API change. I agree it's due. If you have any ideas how this could look, I'm open to them. Otherwise I'll have to noodle on this a bit. cheers

spencermountain • Apr 03 '23 13:04
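
The collision risk raised in the reply above can be checked before packing. What follows is only a minimal sketch, assuming the same placeholder mapping as the workaround; `encodeDigits` and `decodeDigits` are hypothetical helpers, not part of efrt, and the pack/unpack step itself is left out.

```javascript
// Placeholder -> digit mapping, copied from the workaround in this thread.
const chars = {
  ')': '0', '~': '1', '@': '2', '#': '3', '$': '4',
  '%': '5', '^': '6', '&': '7', '*': '8', '(': '9',
};

// Reverse mapping (digit -> placeholder), used before packing.
const placeholders = Object.fromEntries(
  Object.entries(chars).map(([symbol, digit]) => [digit, symbol])
);

// Hypothetical helper: swap digits for placeholders. Input that already
// contains a placeholder is rejected, because after decoding it could not
// be told apart from a digit.
function encodeDigits(word) {
  for (const ch of word) {
    if (Object.hasOwn(chars, ch)) {
      throw new Error(`"${word}" already contains placeholder "${ch}"`);
    }
  }
  return word.replace(/[0-9]/g, (d) => placeholders[d]);
}

// Hypothetical helper: restore the digits after unpacking.
function decodeDigits(word) {
  return word.replace(/[)~@#$%^&*(]/g, (m) => chars[m]);
}

console.log(decodeDigits(encodeDigits('101domain.com')));
```

For plain ASCII hostnames (letters, digits, hyphens and dots) the guard should never fire, but it turns a silent corruption into a loud error if other input slips in.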

yeah, your encoding example can work because your input can be checked ahead of time. It's a fine solution, as long as you're careful about not also using these additional encoding characters as inputs.

prepending backslashes like '\5' in the compressed text would be a better solution, but will require a fair amount of complexity within the library. It may throw off the indexes and stuff. Then it's: how do you encode a backslash, etc. Could be a real doozy.

spencermountain • Apr 03 '23 13:04
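
To make the backslash question concrete, here is one way such an escape layer could look on its own, outside the library. This is a sketch under assumptions: the letter codes and the names `escapeDigits` / `unescapeDigits` are invented for illustration, and nothing here touches efrt's internal format or indexes.

```javascript
// Hypothetical escape scheme: each digit becomes a backslash plus a letter
// ('0' -> '\a', '1' -> '\b', ...), and a literal backslash becomes '\\'.
const DIGIT_TO_LETTER = 'abcdefghij'; // index = digit value

function escapeDigits(text) {
  return text.replace(/[\\0-9]/g, (ch) =>
    ch === '\\' ? '\\\\' : '\\' + DIGIT_TO_LETTER[Number(ch)]
  );
}

function unescapeDigits(text) {
  // Every backslash introduces exactly one escaped character.
  return text.replace(/\\(.)/g, (match, code) => {
    if (code === '\\') return '\\';
    const digit = DIGIT_TO_LETTER.indexOf(code);
    if (digit === -1) throw new Error(`unknown escape "${match}"`);
    return String(digit);
  });
}

const escaped = escapeDigits('101domain.com');
console.log(escaped, unescapeDigits(escaped));
```

Each digit or backslash costs one extra character in the escaped text. Whether the library's indexes would survive escapes like these is the open complexity mentioned above, and this sketch does not address it.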

Yeah, I thought about escape characters, such as \a for 0 and \b for 1. However, you are then using two characters to represent a single character, which will bloat the size a bit. One thing that came to mind, though: what if the final output was stored as binary data instead of text, similar to how MessagePack does it? If you ever wanted to convert it back to text, you could just encode it using Base64. Regardless, I implemented my own version of the unpack function in TypeScript to suit my purpose; it returns a Set. Perhaps someone will find it useful. I also modified the packer to drop the true¦ portion from the beginning.

dkrutsko • Apr 03 '23 20:04
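
The custom unpack function mentioned above is not included in the thread. As a rough illustration of the Array-or-Set output discussed earlier, a thin wrapper over the object that `unpack` returns might look like the sketch below. `collectWords` is a hypothetical name, not an efrt API, and the object shape (words as keys) is assumed from the workaround's use of `Object.keys`.

```javascript
// Hypothetical helper, not part of efrt: turn the object returned by
// `unpack` (words as keys) into a Set or an Array of words.
function collectWords(unpacked, as = 'set') {
  const words = Object.keys(unpacked);
  if (as === 'array') return words;
  if (as === 'set') return new Set(words);
  throw new Error(`unsupported output type "${as}"`);
}

// Stand-in for an unpacked result; efrt itself is not called here.
const unpacked = { 'example.com': true, 'domain.org': true };
const set = collectWords(unpacked);
console.log(set.has('example.com')); // true
```

A wrapper like this still builds the intermediate object first, so it only addresses the ergonomics, not the memory cost of the object form.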