
Support for std::u16string.

Open Gringoniceguy opened this issue 3 years ago • 9 comments

Would it be possible to add support for more basic string types such as std::wstring, std::u16string, and std::u32string? That's pretty much it. I would add this myself, but I'm confused about how the string gets turned into bytes. Is it safe to just add more generic_string definitions for std::u16string?

Gringoniceguy avatar Dec 03 '20 18:12 Gringoniceguy

I think the first step would be to go through string.h and replace each occurrence of char with a template parameter indicating the character type. Then it should be easy to define a cista::raw::u16string as cista::generic_string<char16_t, char16_t*>.

Hardcoded positions, like 15 as the last character, should of course be computed from how many characters can be placed inline (within sizeof(generic_string)):
https://github.com/felixguendling/cista/blob/4977b9037310f13268016962a9816c46b55de82b/include/cista/containers/string.h#L301
https://github.com/felixguendling/cista/blob/4977b9037310f13268016962a9816c46b55de82b/include/cista/containers/string.h#L323
https://github.com/felixguendling/cista/blob/4977b9037310f13268016962a9816c46b55de82b/include/cista/containers/string.h#L83
etc.

We should also add tests to verify everything is working with the "new" string.
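As a rough illustration of the two steps described above (a character-type template parameter and a computed inline capacity), here is a minimal sketch. The names generic_string_sketch, heap, stack, and short_capacity are hypothetical, and the layout is a simplified stand-in, not the code in cista's string.h.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified layout for illustration; NOT cista's real code.
template <typename Char, typename Ptr = Char const*>
struct generic_string_sketch {
  // Heap representation: flags, length, and a pointer to the characters.
  struct heap {
    bool is_short_;
    bool self_allocated_;
    std::uint32_t size_;
    Ptr ptr_;
  };

  // Inline capacity: the bytes of `heap` minus the slot occupied by the
  // is_short_ flag (padded to alignof(Char)), divided by the character size.
  // This takes the place of a hardcoded constant such as 15.
  static constexpr std::size_t short_capacity =
      (sizeof(heap) - alignof(Char)) / sizeof(Char);

  // Inline representation occupying the same number of bytes as `heap`.
  struct stack {
    bool is_short_;
    Char s_[short_capacity];
  };

  static_assert(sizeof(stack) == sizeof(heap),
                "both representations must have the same size");
};

using string_sketch = generic_string_sketch<char>;
using u16string_sketch = generic_string_sketch<char16_t>;
using u32string_sketch = generic_string_sketch<char32_t>;
```

On a typical 64-bit platform, where sizeof(heap) is 16 and char16_t and char32_t are 2 and 4 bytes wide, this yields 15, 7, and 3 inline characters respectively. Tests for the "new" string could assert exactly these relationships.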

felixguendling avatar Dec 03 '20 21:12 felixguendling

I have looked into it and this seems pretty complex. I don't know how to pass in only const char* but also get char* and char from it. But why don't we just use std::basic_string? Doesn't it serve the same purpose as generic_string? It can produce all the standard string types and would be much easier.

Gringoniceguy avatar Dec 04 '20 21:12 Gringoniceguy

"Why don't we just use std::basic_string? Doesn't it serve the same purpose as generic_string?"

Basically, yes. However, std::string in libstdc++, the MSVC standard library, and libc++ may each have a different binary layout. This makes it unusable as a serialization format, which should be independent of the compiler and standard library combination you use. For serialization we need access to internal pointers, fields, etc., which may change in upcoming versions. The serialization approach of Cista can only be implemented in a stable and highly performant way for data structures owned by Cista itself.
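To make the layout argument concrete, here is a small sketch contrasting std::string with a type whose layout is fixed by its author. The names std_string_size and owned_layout_sketch are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

// Implementation-defined: the value depends on the standard library in use,
// and the C++ standard does not specify the internal fields at all.
constexpr std::size_t std_string_size = sizeof(std::string);

// std::string owns memory through a non-trivial destructor, so its bytes
// cannot simply be copied into a buffer and read back elsewhere.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string is not trivially copyable");

// Hypothetical type whose fields and sizes are chosen by the library author.
// Its layout does not change when the standard library changes.
struct owned_layout_sketch {
  std::uint32_t size_;
  char inline_[12];
};

static_assert(std::is_trivially_copyable<owned_layout_sketch>::value,
              "plain fields can be copied byte by byte");
static_assert(sizeof(owned_layout_sketch) == 16,
              "4-byte length plus 12 inline bytes");
```

Printing std_string_size under different compiler and standard library combinations is a quick way to see whether the layouts differ on the platforms you target.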

felixguendling avatar Dec 05 '20 17:12 felixguendling

So what do you say to taking the standard library's code, which is fast and uses tried-and-tested methods, modifying it, and removing anything we don't need?

Gringoniceguy avatar Dec 05 '20 17:12 Gringoniceguy

Of which standard library? There are several implementations, and they are all quite complex. I think this would be a lot of work. It would certainly be easier to just make some small changes to support u16string, etc. in Cista itself.

felixguendling avatar Dec 05 '20 17:12 felixguendling

Hm, probably, but I'm not really sure how the library works. I have tried to change it, but it seems quite difficult and I don't really know how to.

Gringoniceguy avatar Dec 05 '20 17:12 Gringoniceguy

Or well, I do know how to. The issue is that I might need another template parameter, which would require a lot of changes, because the template also needs to contain the base type, i.e. template<typename Ptr = const char*, typename Base = char>. I don't know how to do this.
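One possible way around a second template parameter is to derive the character type from the pointer type. The sketch below uses std::pointer_traits for that; char_type_of and ptr_string_sketch are hypothetical names, and whether this works unchanged with Cista's own offset pointer type is an assumption that has not been checked here.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical helper: maps a pointer type to the character type it points
// to, dropping const/volatile (e.g. `char const*` becomes `char`).
template <typename Ptr>
struct char_type_of {
  using type = typename std::remove_cv<
      typename std::pointer_traits<Ptr>::element_type>::type;
};

// Hypothetical string skeleton with a single template parameter. The base
// character type is computed instead of being passed in separately.
template <typename Ptr = char const*>
struct ptr_string_sketch {
  using value_type = typename char_type_of<Ptr>::type;

  Ptr ptr_ = nullptr;
  unsigned size_ = 0;
};

static_assert(
    std::is_same<ptr_string_sketch<>::value_type, char>::value,
    "default instantiation uses char");
static_assert(
    std::is_same<ptr_string_sketch<char16_t const*>::value_type,
                 char16_t>::value,
    "char16_t pointers yield char16_t");
```

The alternative discussed earlier in the thread, passing the character type first as in generic_string<char16_t, char16_t*>, is more explicit; deriving it only saves callers from repeating the type.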

Gringoniceguy avatar Dec 05 '20 18:12 Gringoniceguy

I'll look into it.

Maybe for now, you can use cista::vector<char16_t>?
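A minimal sketch of what that workaround could look like at the call site. To keep the example self-contained, std::vector<char16_t> stands in for cista::vector<char16_t>; the helper names are hypothetical, and the sketch assumes the Cista container can be filled from and iterated like a standard vector.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for cista::vector<char16_t> (assumption, see the text above).
using u16_buffer = std::vector<char16_t>;

// Copy the characters of a std::u16string into the serializable buffer.
inline u16_buffer to_buffer(std::u16string const& s) {
  return u16_buffer(s.begin(), s.end());
}

// Rebuild a std::u16string from the buffer after deserialization.
inline std::u16string to_u16string(u16_buffer const& v) {
  return std::u16string(v.begin(), v.end());
}

// Example: a round trip preserves the length and the contents.
inline void example_round_trip() {
  std::u16string const original = u"hello";
  u16_buffer const stored = to_buffer(original);
  assert(stored.size() == original.size());
  assert(to_u16string(stored) == original);
}
```

Keeping the two conversions in one place means that only these helpers need to change once a native u16string type becomes available.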

felixguendling avatar Dec 05 '20 19:12 felixguendling

Yeah, I have tested this before. It's just annoying having to use a vector and make a string from it instead of using built-in types. This will be fine for the time being; I will just add TODOs for each line.

Gringoniceguy avatar Dec 05 '20 20:12 Gringoniceguy