Exporting .vcf: line folding counts Unicode code points, but the specification sets the limit at 75 octets

Open bernieEV opened this issue 3 years ago • 3 comments

Lines with multi-byte content get too long because folding counts Unicode code points instead of bytes/octets.

Oct 01 '22 08:10 bernieEV

Can you give an example?

Oct 01 '22 09:10 tibbi

tibor.txt: one card with many multi-byte symbols, encoded in UTF-8

Oct 01 '22 16:10 bernieEV

Hi Tibor, Kotlin has the property "code", which gives the numeric value of a Char. For symbols up to 0xFFFF this is the Unicode code point; symbols above 0xFFFF are stored as two Chars (a surrogate pair), so their code point has to be read with codePointAt. From the code point there is a clear way to tell how many octets a symbol will take after UTF-8 encoding:

  0x0..0x7F (US-ASCII, unchanged): 1 octet per symbol
  0x80..0x7FF: 2 octets per symbol
  0x800..0xFFFF: 3 octets per symbol
  0x10000..0x10FFFF: 4 octets per symbol
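The ranges above could be turned into an octet-aware fold roughly like this. It is only a sketch, not the app's actual export code: the names utf8Length and foldLine are hypothetical, it assumes Kotlin on the JVM (codePointAt, appendCodePoint and Character.charCount come from the Java standard library), and it assumes the leading space of a continuation line counts toward the 75-octet limit, which is the conservative reading.

```kotlin
// Hypothetical helper: number of octets one Unicode code point takes in UTF-8.
fun utf8Length(codePoint: Int): Int = when {
    codePoint <= 0x7F -> 1      // US-ASCII
    codePoint <= 0x7FF -> 2
    codePoint <= 0xFFFF -> 3
    else -> 4                   // 0x10000..0x10FFFF
}

// Hypothetical helper: fold one logical content line so that no physical
// line exceeds maxOctets. Walks the string by code point, so a multi-octet
// symbol (or a surrogate pair) is never split across two lines.
fun foldLine(line: String, maxOctets: Int = 75): String {
    val out = StringBuilder()
    var octetsOnLine = 0
    var i = 0
    while (i < line.length) {
        val cp = line.codePointAt(i)          // full code point, not a single Char
        val size = utf8Length(cp)
        if (octetsOnLine + size > maxOctets) {
            out.append("\r\n ")               // CRLF plus one space starts a continuation line
            octetsOnLine = 1                  // assumption: the leading space counts as one octet
        }
        out.appendCodePoint(cp)
        octetsOnLine += size
        i += Character.charCount(cp)          // advance 1 or 2 Chars
    }
    return out.toString()
}
```

Counting by code point and summing octets keeps the fold decision independent of how many Chars a symbol occupies in the String, which is the mismatch described in this issue.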

Nov 20 '22 15:11 bernieEV