tiny-utf8
Is it possible to make tiny-utf8 case-insensitive?
For example, with std::string we can define a separate typedef over std::basic_string with custom character traits, so that the resulting type behaves like a case-insensitive string class. I tried the same approach with the tiny-utf8 class and got too many errors. Below is the code that makes the std::basic_string typedef behave case-insensitively. Any clue?
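#include <cctype>
#include <cstddef>
#include <string>

// Character traits that compare single-byte characters case-insensitively.
struct ci_char_traits : public std::char_traits<char> {
    // Cast to unsigned char first: passing a negative char to toupper is undefined behavior.
    static int up(char c) { return std::toupper(static_cast<unsigned char>(c)); }
    static bool eq(char c1, char c2) { return up(c1) == up(c2); }
    static bool ne(char c1, char c2) { return up(c1) != up(c2); }
    static bool lt(char c1, char c2) { return up(c1) < up(c2); }
    static int compare(const char* s1, const char* s2, std::size_t n) {
        while (n-- != 0) {
            if (up(*s1) < up(*s2)) return -1;
            if (up(*s1) > up(*s2)) return 1;
            ++s1; ++s2;
        }
        return 0;
    }
    // Returns nullptr when nothing matches, as std::char_traits::find does.
    static const char* find(const char* s, std::size_t n, char a) {
        while (n-- > 0) {
            if (up(*s) == up(a)) return s;
            ++s;
        }
        return nullptr;
    }
};
typedef std::basic_string<char, ci_char_traits> ci_string;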
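To show what this gives with plain std::basic_string, here is a small usage sketch, assuming the default "C" locale. The names `ci_traits_demo`, `ci_string_demo`, `ascii_folds` and `umlaut_folds` are made up for this example, and the traits are a condensed copy so the snippet compiles on its own. It also hints at why a byte-wise trick like this is unlikely to carry over to UTF-8 text: ASCII letters fold, but "Ä" and "ä" are two-byte sequences that toupper leaves alone.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Condensed, hypothetical copy of the traits above (only what == needs).
struct ci_traits_demo : std::char_traits<char> {
    static int up(char c) { return std::toupper(static_cast<unsigned char>(c)); }
    static bool eq(char a, char b) { return up(a) == up(b); }
    static bool lt(char a, char b) { return up(a) < up(b); }
    static int compare(const char* s1, const char* s2, std::size_t n) {
        for (; n != 0; --n, ++s1, ++s2) {
            if (up(*s1) != up(*s2)) return up(*s1) < up(*s2) ? -1 : 1;
        }
        return 0;
    }
};
using ci_string_demo = std::basic_string<char, ci_traits_demo>;

// ASCII letters are folded byte by byte, so these two compare equal.
inline bool ascii_folds() {
    return ci_string_demo("Hello") == ci_string_demo("hELLO");
}

// "Ä" is 0xC3 0x84 and "ä" is 0xC3 0xA4 in UTF-8. The second bytes differ and
// toupper does not map them in the "C" locale, so these do not compare equal.
inline bool umlaut_folds() {
    return ci_string_demo("\xC3\x84") == ci_string_demo("\xC3\xA4");
}
```

So the traits approach covers ASCII only; anything beyond that seems to need folding at the code point level rather than per byte.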
Hey, I will figure out what I can do for you! I might need some time to reply, but I won't forget.
Hi Jakob,
I happen to have a lookup table, including the accented characters, the funky variants, and exceptions like the Turkish dotless i. I can email you the source code, if you like.
Hi Vadim, sure! I will have a look at it and see how it might be included. What license is it under?
Thanks, Jakob.
As a compilation, it's ours (traversal of Unicode tables + manual changes and proofreading), so whatever license you prefer, I guess :) . Nothing fancy, it looks like this:
LOAD_LATIN_LETTER_PAIR(u8"F", u8"f");
LOAD_LATIN_LETTER_PAIR(u8"G", u8"g");
LOAD_LATIN_LETTER_PAIR(u8"H", u8"h");
if (_standardCode == "tr") {
    LOAD_LATIN_LETTER_PAIR(u8"İ", u8"i");
    LOAD_LATIN_LETTER_PAIR(u8"I", u8"ı");
} else {
    LOAD_LATIN_LETTER_PAIR(u8"I", u8"i");
}
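For illustration, here is a rough sketch of how a pair table like this might drive a case-insensitive comparison: map each code point through the table, then compare. Everything in it (`FoldTable`, `make_table`, `fold`, `equals_ci`) is hypothetical and is neither the table's real loader nor part of tiny-utf8. It works on std::u32string to avoid assuming any tiny-utf8 API, holds only a handful of pairs, and does a simple one-to-one mapping, so cases like "ß" versus "ss" are out of scope.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical fold table: uppercase code point -> lowercase partner.
using FoldTable = std::unordered_map<char32_t, char32_t>;

// Builds a tiny demo table; a real table would cover far more letters.
inline FoldTable make_table(const std::string& lang) {
    FoldTable t = {{U'F', U'f'}, {U'G', U'g'}, {U'H', U'h'}};
    if (lang == "tr") {
        t[U'\u0130'] = U'i';   // dotted capital I (İ) pairs with i
        t[U'I'] = U'\u0131';   // I pairs with dotless i (ı)
    } else {
        t[U'I'] = U'i';
    }
    return t;
}

// Maps a code point to its lowercase partner, or returns it unchanged.
inline char32_t fold(const FoldTable& t, char32_t c) {
    auto it = t.find(c);
    return it == t.end() ? c : it->second;
}

// Compares two code point sequences after folding each element.
inline bool equals_ci(const FoldTable& t, const std::u32string& a,
                      const std::u32string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (fold(t, a[i]) != fold(t, b[i])) return false;
    }
    return true;
}
```

With the "tr" table, `equals_ci(make_table("tr"), U"I", U"i")` is false while `U"I"` and `U"\u0131"` compare equal, which is the Turkish exception mentioned above.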