
is it possible to make tiny-utf8 case insensitive?

Open rdev1983 opened this issue 2 years ago • 4 comments

For example, we can define a separate typedef of std::basic_string with custom character traits so that it behaves like a case-insensitive string class. I tried the same approach with the tiny-utf8 class and got too many errors. The following code makes a std::string-like class case-insensitive. Any clue?


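#include <cctype>   // std::toupper
#include <cstddef>  // std::size_t
#include <string>   // std::char_traits, std::basic_string

// Character traits that compare chars case-insensitively (single-byte characters only).
struct ci_char_traits : public std::char_traits<char> {
    // Cast through unsigned char: std::toupper is undefined for negative values.
    static int up(char c) { return std::toupper(static_cast<unsigned char>(c)); }
    static bool eq(char c1, char c2) { return up(c1) == up(c2); }
    static bool ne(char c1, char c2) { return up(c1) != up(c2); }
    static bool lt(char c1, char c2) { return up(c1) <  up(c2); }
    static int compare(const char* s1, const char* s2, std::size_t n) {
        while( n-- != 0 ) {
            if( up(*s1) < up(*s2) ) return -1;
            if( up(*s1) > up(*s2) ) return 1;
            ++s1; ++s2;
        }
        return 0;
    }
    // Returns a pointer to the first match, or nullptr if there is none.
    static const char* find(const char* s, std::size_t n, char a) {
        for( ; n != 0; --n, ++s ) {
            if( up(*s) == up(a) ) return s;
        }
        return nullptr;
    }
};

typedef std::basic_string<char, ci_char_traits> ci_string;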

rdev1983 avatar Sep 16 '21 11:09 rdev1983
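
The traits trick above relies on std::basic_string taking a traits template parameter, so it may not carry over to a string class that does not expose one. A container-agnostic alternative is to compare code points through a folding function. The following is only a minimal sketch: `fold_ascii` and `ci_equal` are hypothetical names, only ASCII letters are folded, and it assumes the string can be iterated as a sequence of `char32_t` code points (std::u32string stands in for such a string here).

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical helper: folds ASCII 'A'..'Z' to lowercase and leaves
// every other code point untouched.
inline char32_t fold_ascii(char32_t c) {
    return (c >= U'A' && c <= U'Z') ? static_cast<char32_t>(c + 32) : c;
}

// Hypothetical helper: case-insensitive equality over any two ranges
// whose elements convert to char32_t code points.
template <typename Range1, typename Range2>
bool ci_equal(const Range1& a, const Range2& b) {
    return std::equal(std::begin(a), std::end(a),
                      std::begin(b), std::end(b),
                      [](char32_t x, char32_t y) {
                          return fold_ascii(x) == fold_ascii(y);
                      });
}
```

For instance, `ci_equal(std::u32string(U"Hello"), std::u32string(U"hELLO"))` compares equal, while strings of different lengths never do. Swapping `fold_ascii` for a fuller folding function would extend this beyond ASCII.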

Hey, I will figure out what I can do for you! I might need some time to reply, but I won't forget.

DuffsDevice avatar Nov 09 '21 16:11 DuffsDevice

Hi Jakob,

I happen to have a lookup table, including the accented characters, the funky variants, and exceptions like the Turkish dotless i. I can email you the source code, if you like.

vadim-berman avatar Nov 10 '21 01:11 vadim-berman

Hi Vadim, sure! I will have a look at it and see how it might be included. What license is it under?

DuffsDevice avatar Nov 15 '21 16:11 DuffsDevice

Thanks, Jakob.

As a compilation, it's ours (a traversal of the Unicode tables plus manual changes and proofreading), so whatever license you prefer, I guess :). Nothing fancy, it looks like this:

    LOAD_LATIN_LETTER_PAIR(u8"F", u8"f");
    LOAD_LATIN_LETTER_PAIR(u8"G", u8"g");
    LOAD_LATIN_LETTER_PAIR(u8"H", u8"h");
    if (_standardCode == "tr") {
        LOAD_LATIN_LETTER_PAIR(u8"İ", u8"i");
        LOAD_LATIN_LETTER_PAIR(u8"I", u8"ı");
    } else {
        LOAD_LATIN_LETTER_PAIR(u8"I", u8"i");
    }

vadim-berman avatar Nov 16 '21 03:11 vadim-berman
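
A pair table like the one above could drive a language-aware comparison roughly as follows. This is only a sketch under assumptions: `CaseFolder`, `add_pair`, `fold`, and `equal` are hypothetical names that belong neither to tiny-utf8 nor to the code quoted in this thread, the table holds just the handful of pairs shown, and the strings are assumed to be already decoded to code points.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical class: maps uppercase code points to their lowercase
// counterparts, with a special case for Turkish ("tr").
class CaseFolder {
public:
    explicit CaseFolder(const std::string& language) {
        add_pair(U'F', U'f');
        add_pair(U'G', U'g');
        add_pair(U'H', U'h');
        if (language == "tr") {
            add_pair(U'\u0130', U'i');  // dotted capital I  -> i
            add_pair(U'I', U'\u0131');  // plain capital I   -> dotless small i
        } else {
            add_pair(U'I', U'i');
        }
    }

    // Returns the lowercase form if the code point is in the table,
    // otherwise the code point itself.
    char32_t fold(char32_t c) const {
        auto it = lower_.find(c);
        return it == lower_.end() ? c : it->second;
    }

    // Compares two code point strings after folding each code point.
    bool equal(const std::u32string& a, const std::u32string& b) const {
        if (a.size() != b.size()) return false;
        for (std::size_t i = 0; i < a.size(); ++i) {
            if (fold(a[i]) != fold(b[i])) return false;
        }
        return true;
    }

private:
    void add_pair(char32_t upper, char32_t lower) { lower_[upper] = lower; }
    std::unordered_map<char32_t, char32_t> lower_;
};
```

With this table, `CaseFolder("tr")` treats `I` and `ı` as equal but not `I` and `i`, while any other language code treats `I` and `i` as equal. A one-to-one map like this cannot express folds that change the number of code points, so a full solution would need more than this sketch.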