"Impossible sequences" masterlist

Open JapanYoshi opened this issue 2 years ago • 0 comments

I thought that instead of having disparate issues all about different untypable sequences (#307 #337 #398 #417 #431), having one big one would be better for making a decision on what to do about these.

Alphabetic presentation forms

  • U+FB05 LATIN SMALL LIGATURE LONG S T: the ſt sequence is impossible to type because there is no ſ key. The ft sequence evaluates to the same symbol.
  • All the dotted Hebrew letters. As far as I know the various dot symbols cannot be typed even on a Hebrew keyboard. Suggestion: Use number keys. Although it could be ad hoc and hard to remember, it would at least make it usable.

Arabic

  • All the sequences. The Arabic keyboard has the diacritics, but the default behavior is to combine the diacritics, so WinCompose doesn't have to do anything.

Arrows

  • U+219A LEFTWARDS ARROW WITH STROKE, U+219B RIGHTWARDS ARROW WITH STROKE, and U+21AE LEFT RIGHT ARROW WITH STROKE. Their only sequences require standalone keys for ←, →, and ↔, respectively. Suggestion: Free up one of the slash sequences </ and /<.
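
To judge whether a sequence such as </ is actually free, a check along these lines might help. It is only a sketch: it assumes sequences clash when one is identical to, or a prefix of, another (the usual compose-key behaviour), and the `existing` table is made-up toy data, not WinCompose's real sequence list. `is_free` is a hypothetical helper name.

```python
def is_free(candidate, existing):
    """Return True if `candidate` clashes with nothing in `existing`.

    Assumed clash rule: an exact match, or one sequence being a prefix of
    the other (the shorter one would fire before the longer is finished).
    """
    for seq in existing:
        shorter, longer = sorted((candidate, seq), key=len)
        if longer[:len(shorter)] == shorter:
            return False
    return True


# Toy table for illustration only, not WinCompose's real data.
existing = [("<", "-"), ("-", ">"), ("<", "/", "=")]

print(is_free(("<", "/"), existing))  # clashes: prefix of ("<", "/", "=")
print(is_free(("/", "<"), existing))  # nothing in the toy table clashes
```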

Basic Latin

  • I mean, I get that the point is to account for international keyboards without all the ASCII symbols, but do we really need this many (32) sequences for ASCII symbols?

Bengali

  • The Bengali keyboard already has keys for VOWEL SIGN O and VOWEL SIGN AU, and the rest are possible with Shift and AltGr already. These should be removed.

CJK Unified Ideographs Extension B

  • Those aren't Chinese ideograms. Those are emoji.

Cyrillic

  • U+040E CYRILLIC CAPITAL LETTER SHORT U - its sequence mixes Latin and Cyrillic letters. Made unnecessary by the existing sequence ЬУ. Same with all other sequences for letters with a breve.
  • If we're adding Cyrillic cross-language support, why not add sequences for the Ukrainian or Serbian letters using only Russian letters? e.g. і ← , ї ← 1", ґ ← г', є ← ээ?

Devanagari

  • Same deal as Bengali.

Enclosed CJK letters and months

  • Everything from U+3280 CIRCLED IDEOGRAPH ONE to U+32B0 CIRCLED IDEOGRAPH NIGHT. In both Japanese and Chinese, these ideograms do not have dedicated keys. (If they did, the keyboard would be a massive monstrosity.) Remove.
  • Everything from U+32D0 CIRCLED KATAKANA A to U+32FE CIRCLED KATAKANA WO. We don't have dedicated keys for katakana. Remove, as the IME can convert these symbols for us.

Enclosed ideographic supplement

  • All of these contain curly Unicode quotes, which don't have dedicated keys. Remove the sequences or make them less verbose.

Greek and Coptic

  • U+0390 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS - ϊ doesn't have a dedicated key. Besides, the Greek keyboard layout already has dead keys for Tonos, Dialytika, and Dialytika-and-Tonos, so all the sequences for those letters are unnecessary.

Greek extended

  • U+1F04 GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA - Accented letters don't have a dedicated key. Remove all sequences containing accented letters.

Gurmukhi

  • Same deal as Bengali and Devanagari.

Hangul jamo

  • U+113D HANGUL CHOSEONG CHITUEUMSSANGSIOS and U+113F HANGUL CHOSEONG CEONGCHIEUMSSANGSIOS. Both letters are archaic, and the sequence also uses archaic letters (U+113C and 113E). Remove or replace with sequences using just the normal letter sios. Same deal with U+1146, 114F, 1151, 11C3, 11D5, 11D7, 11D9, 11DF, 11F1, 11F2. (Just run that whole block through a lookup of Korean keyboard layout keys and modify/remove everything that can't be typed.)
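
The "run the block through a layout lookup" idea could look roughly like this. Everything here is a stand-in: `layout_keys` is a hypothetical key set rather than a real Korean layout dump, the `(keys, result)` pairs are toy data, and `untypable_sequences` is a name I made up.

```python
def untypable_sequences(sequences, layout_keys):
    """List (result, keys, missing) for sequences the layout cannot type."""
    report = []
    for keys, result in sequences:
        # Collect every key of the sequence that the layout does not offer.
        missing = [k for k in keys if k not in layout_keys]
        if missing:
            report.append((result, keys, missing))
    return report


# Hypothetical key set standing in for a real layout dump.
layout_keys = set("ㄱㄴㄷㅅㅇㅈ")

# Toy sequences written as (keys, result); not WinCompose's real data.
sequences = [
    (("\u113C", "\u113C"), "\u113D"),  # needs the archaic U+113C twice
    (("ㅅ", "ㅅ"), "ㅆ"),              # typable with the toy key set
]

for result, keys, missing in untypable_sequences(sequences, layout_keys):
    codes = ", ".join(f"U+{ord(k):04X}" for k in missing)
    print(f"U+{ord(result):04X} is untypable, missing keys: {codes}")
```

Anything the report lists would then be a candidate for removal or for a replacement sequence built from keys the layout does have.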

Kanbun

  • Same deal as Enclosed CJK letters and months.

Kannada

  • I suspect the same deal as Bengali, et al.

Latin Extended-B

  • U+01D5 LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON and other letters in the block have sequences containing accented letters.
  • U+01E2 LATIN CAPITAL LETTER AE WITH MACRON, U+01E3, U+01FC, and U+01FD all need a standalone Æ key with no alternate sequence. Usable in Norwegian, but not on most keyboards.
  • U+01EE and U+01EF need a standalone ʒ key. (There's not even a sequence for the capital letter.)

Latin Extended-D

  • U+A765 LATIN SMALL LETTER THORN WITH STROKE is only used in Old English and Old Norse, while the only surviving language with a thorn key is Icelandic.

Malayalam

  • cf. Bengali

Mathematical Operators

  • U+2204 THERE DOES NOT EXIST has a sequence requiring ∃.
  • U+2209 NOT AN ELEMENT OF has a sequence requiring ∈.
  • U+220C DOES NOT CONTAIN AS MEMBER has a sequence requiring ∋.
  • U+2224 DOES NOT DIVIDE and U+2226 NOT PARALLEL TO both require non-ASCII symbols with no alternate. |/ and ||/ are both free.
  • U+2241 NOT TILDE, U+2244 NOT ASYMPTOTICALLY EQUAL TO, U+2247 NEITHER APPROXIMATELY NOR ACTUALLY EQUAL TO, and U+2249 NOT ALMOST EQUAL TO all require non-ASCII characters, such as a non-ASCII slash, with no alternate.
  • 3 of 4 sequences for U+2262 NOT IDENTICAL TO require a non-ASCII symbol.
  • U+226D, U+2270, U+2271, U+2274, U+2275, U+2278, U+2279, U+2280, U+2281, U+2284, U+2285, U+2286, U+2287, U+2288, U+2289, U+2296, U+2299, U+22AC, U+22AD, U+22AE, U+22AF, U+22E0, U+22E1, U+22E2, U+22E3, U+22EA, U+22EB, U+22EC, U+22ED are all inaccessible for the same reason.
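
A list like the one above could be regenerated mechanically instead of by hand. The sketch below groups sequences by the character they produce and reports every character that has no ASCII-only sequence at all. The `(keys, result)` pairs are invented for illustration and are not copied from WinCompose's data files; `lacks_ascii_sequence` is a hypothetical name.

```python
from collections import defaultdict


def lacks_ascii_sequence(sequences):
    """Return results for which no sequence consists of ASCII keys only."""
    by_result = defaultdict(list)
    for keys, result in sequences:
        by_result[result].append(keys)
    return sorted(
        result
        for result, seqs in by_result.items()
        if not any(all(k.isascii() for k in keys) for keys in seqs)
    )


# Toy data for illustration only.
sequences = [
    (("∃", "/"), "∄"),       # only sequence needs ∃
    (("∈", "/"), "∉"),       # only sequence needs ∈
    (("≡", "/"), "≢"),       # non-ASCII sequence...
    (("=", "_", "/"), "≢"),  # ...but an ASCII alternate exists
]

for ch in lacks_ascii_sequence(sequences):
    print(f"U+{ord(ch):04X} {ch} has no ASCII-only sequence")
```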

Miscellaneous technical

  • U+2336, 2338, 2339, 233A, 233C, 233D, 233E, 2341, 2342, 2343, 2344, 2345, 2346, 2347, 2348, 2349, 234A, 234B, 234C, 234D, 234E, 234F, 2350, 2351, 2352, 2353, 2354, 2355, 2356, 2357, 2359, 235A, 235B, 235C, 235D, 235E, 235F, 2360, 2361, 2362, 2363, 2364, 2365, 2366, 2367, 2368, 2369, 236B, 236F, 2370, 2371, 2372, 2376, 2377, 2378, 2379 are all APL symbols with no ASCII sequences. Consider removing, as APL is a dead language.

Musical symbols

  • Everything but the three clefs requires symbols that are definitely not on keyboards. Remove, as this block is meant for musical score writing apps to use internally, not plaintext.

Myanmar, Oriya, Sinhala

  • cf. Bengali

Spacing modifier letters

  • U+02B1, 02B4, 02B5, 02B6, 02E0, 02E4 all require non-ASCII characters with no alternative.

Supplemental mathematical operators

  • U+2ADC FORKING requires non-ASCII characters with no alternate.

Supplemental punctuation

  • There are two duplicate entries for U+2E18 INVERTED INTERROBANG ← ?! for non-Spanish keyboards.
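
Duplicates like this should be easy to catch automatically. A minimal sketch, assuming the sequence table has already been parsed into hashable `(keys, result)` tuples (the parsing itself is not shown, the data is a toy example, and `duplicate_entries` is a made-up name):

```python
from collections import Counter


def duplicate_entries(sequences):
    """Return each (keys, result) entry that occurs more than once."""
    counts = Counter(sequences)
    return {entry: n for entry, n in counts.items() if n > 1}


# Toy data for illustration only.
sequences = [
    (("?", "!"), "\u2E18"),
    (("?", "!"), "\u2E18"),  # same entry listed a second time
    (("!", "?"), "\u203D"),
]

for (keys, result), n in duplicate_entries(sequences).items():
    print(f"{''.join(keys)} -> U+{ord(result):04X} is listed {n} times")
```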

Tamil, Telugu, most of Tibetan

  • cf. Bengali

JapanYoshi Sep 03 '21 12:09