Difficulty searching for small triangles

Open Jayman2000 opened this issue 2 years ago • 2 comments

I can see that several small triangles exist:

$ chars 'DOWN-POINTING TRIANGLE'
U+0001F783, 🞃 0x0001F783, \0373603, UTF-8: f0 9f 9e 83, UTF-16BE: d83ddf83
Width: 1, prints as 🞃
Quotes as \u{1f783}
Unicode name: BLACK DOWN-POINTING ISOSCELES RIGHT TRIANGLE

U+0001F53D, 🔽 0x0001F53D, \0372475, UTF-8: f0 9f 94 bd, UTF-16BE: d83ddd3d
Width: 2, prints as 🔽
Quotes as \u{1f53d}
Unicode name: DOWN-POINTING SMALL RED TRIANGLE

U+0001F53B, 🔻 0x0001F53B, \0372473, UTF-8: f0 9f 94 bb, UTF-16BE: d83ddd3b
Width: 2, prints as 🔻
Quotes as \u{1f53b}
Unicode name: DOWN-POINTING RED TRIANGLE

U+2BC6, ⯆ 0x2BC6, \025706, UTF-8: e2 af 86, UTF-16BE: 2bc6
Width: 1, prints as ⯆
Quotes as \u{2bc6}
Unicode name: BLACK MEDIUM DOWN-POINTING TRIANGLE CENTRED

U+29E9, ⧩ 0x29E9, \024751, UTF-8: e2 a7 a9, UTF-16BE: 29e9
Width: 1, prints as ⧩
Quotes as \u{29e9}
Unicode name: DOWN-POINTING TRIANGLE WITH RIGHT HALF BLACK

U+29E8, ⧨ 0x29E8, \024750, UTF-8: e2 a7 a8, UTF-16BE: 29e8
Width: 1, prints as ⧨
Quotes as \u{29e8}
Unicode name: DOWN-POINTING TRIANGLE WITH LEFT HALF BLACK

U+26DB, ⛛ 0x26DB, \023333, UTF-8: e2 9b 9b, UTF-16BE: 26db
Width: 1 (2 in CJK context), prints as ⛛
Quotes as \u{26db}
Unicode name: HEAVY WHITE DOWN-POINTING TRIANGLE

U+25BF, ▿ 0x25BF, \022677, UTF-8: e2 96 bf, UTF-16BE: 25bf
Width: 1, prints as ▿
Quotes as \u{25bf}
Unicode name: WHITE DOWN-POINTING SMALL TRIANGLE

U+25BE, ▾ 0x25BE, \022676, UTF-8: e2 96 be, UTF-16BE: 25be
Width: 1, prints as ▾
Quotes as \u{25be}
Unicode name: BLACK DOWN-POINTING SMALL TRIANGLE

U+25BD, ▽ 0x25BD, \022675, UTF-8: e2 96 bd, UTF-16BE: 25bd
Width: 1 (2 in CJK context), prints as ▽
Quotes as \u{25bd}
Unicode name: WHITE DOWN-POINTING TRIANGLE

U+25BC, ▼ 0x25BC, \022674, UTF-8: e2 96 bc, UTF-16BE: 25bc
Width: 1 (2 in CJK context), prints as ▼
Quotes as \u{25bc}
Unicode name: BLACK DOWN-POINTING TRIANGLE

U+23F7, ⏷ 0x23F7, \021767, UTF-8: e2 8f b7, UTF-16BE: 23f7
Width: 1, prints as ⏷
Quotes as \u{23f7}
Unicode name: BLACK MEDIUM DOWN-POINTING TRIANGLE

U+23EC, ⏬ 0x23EC, \021754, UTF-8: e2 8f ac, UTF-16BE: 23ec
Width: 2, prints as ⏬
Quotes as \u{23ec}
Unicode name: BLACK DOWN-POINTING DOUBLE TRIANGLE

$

But when I try to search for only the small triangles:

$ chars 'SMALL TRIANGLE'
$ 

I get nothing. If I search for medium triangles:

$ chars 'MEDIUM TRIANGLE'
U+0001F827, 🠧 0x0001F827, \0374047, UTF-8: f0 9f a0 a7, UTF-16BE: d83edc27
Width: 1, prints as 🠧
Quotes as \u{1f827}
Unicode name: DOWNWARDS TRIANGLE-HEADED ARROW WITH MEDIUM SHAFT

U+0001F826, 🠦 0x0001F826, \0374046, UTF-8: f0 9f a0 a6, UTF-16BE: d83edc26
Width: 1, prints as 🠦
Quotes as \u{1f826}
Unicode name: RIGHTWARDS TRIANGLE-HEADED ARROW WITH MEDIUM SHAFT

U+0001F825, 🠥 0x0001F825, \0374045, UTF-8: f0 9f a0 a5, UTF-16BE: d83edc25
Width: 1, prints as 🠥
Quotes as \u{1f825}
Unicode name: UPWARDS TRIANGLE-HEADED ARROW WITH MEDIUM SHAFT

U+0001F824, 🠤 0x0001F824, \0374044, UTF-8: f0 9f a0 a4, UTF-16BE: d83edc24
Width: 1, prints as 🠤
Quotes as \u{1f824}
Unicode name: LEFTWARDS TRIANGLE-HEADED ARROW WITH MEDIUM SHAFT

U+0001F807, 🠇 0x0001F807, \0374007, UTF-8: f0 9f a0 87, UTF-16BE: d83edc07
Width: 1, prints as 🠇
Quotes as \u{1f807}
Unicode name: DOWNWARDS ARROW WITH MEDIUM TRIANGLE ARROWHEAD

U+0001F806, 🠆 0x0001F806, \0374006, UTF-8: f0 9f a0 86, UTF-16BE: d83edc06
Width: 1, prints as 🠆
Quotes as \u{1f806}
Unicode name: RIGHTWARDS ARROW WITH MEDIUM TRIANGLE ARROWHEAD

U+0001F805, 🠅 0x0001F805, \0374005, UTF-8: f0 9f a0 85, UTF-16BE: d83edc05
Width: 1, prints as 🠅
Quotes as \u{1f805}
Unicode name: UPWARDS ARROW WITH MEDIUM TRIANGLE ARROWHEAD

U+0001F804, 🠄 0x0001F804, \0374004, UTF-8: f0 9f a0 84, UTF-16BE: d83edc04
Width: 1, prints as 🠄
Quotes as \u{1f804}
Unicode name: LEFTWARDS ARROW WITH MEDIUM TRIANGLE ARROWHEAD

U+2BC8, ⯈ 0x2BC8, \025710, UTF-8: e2 af 88, UTF-16BE: 2bc8
Width: 1, prints as ⯈
Quotes as \u{2bc8}
Unicode name: BLACK MEDIUM RIGHT-POINTING TRIANGLE CENTRED

U+2BC7, ⯇ 0x2BC7, \025707, UTF-8: e2 af 87, UTF-16BE: 2bc7
Width: 1, prints as ⯇
Quotes as \u{2bc7}
Unicode name: BLACK MEDIUM LEFT-POINTING TRIANGLE CENTRED

U+2BC6, ⯆ 0x2BC6, \025706, UTF-8: e2 af 86, UTF-16BE: 2bc6
Width: 1, prints as ⯆
Quotes as \u{2bc6}
Unicode name: BLACK MEDIUM DOWN-POINTING TRIANGLE CENTRED

U+2BC5, ⯅ 0x2BC5, \025705, UTF-8: e2 af 85, UTF-16BE: 2bc5
Width: 1, prints as ⯅
Quotes as \u{2bc5}
Unicode name: BLACK MEDIUM UP-POINTING TRIANGLE CENTRED

U+23F7, ⏷ 0x23F7, \021767, UTF-8: e2 8f b7, UTF-16BE: 23f7
Width: 1, prints as ⏷
Quotes as \u{23f7}
Unicode name: BLACK MEDIUM DOWN-POINTING TRIANGLE

U+23F6, ⏶ 0x23F6, \021766, UTF-8: e2 8f b6, UTF-16BE: 23f6
Width: 1, prints as ⏶
Quotes as \u{23f6}
Unicode name: BLACK MEDIUM UP-POINTING TRIANGLE

U+23F5, ⏵ 0x23F5, \021765, UTF-8: e2 8f b5, UTF-16BE: 23f5
Width: 1, prints as ⏵
Quotes as \u{23f5}
Unicode name: BLACK MEDIUM RIGHT-POINTING TRIANGLE

U+23F4, ⏴ 0x23F4, \021764, UTF-8: e2 8f b4, UTF-16BE: 23f4
Width: 1, prints as ⏴
Quotes as \u{23f4}
Unicode name: BLACK MEDIUM LEFT-POINTING TRIANGLE

$

I still get plenty of results.
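
As a cross-check that characters with both SMALL and TRIANGLE in their names really exist, here is a minimal Python sketch that scans the Unicode character database bundled with Python. It is not how chars performs its lookup. The function name `search_names` is made up for illustration, and the matching rule (every query word must appear as a word of the name, with hyphens treated as separators) is an assumption inferred from the outputs above.

```python
import sys
import unicodedata


def split_words(text):
    """Split a name or query into upper-case words; hyphens act as separators."""
    return set(text.upper().replace("-", " ").split())


def search_names(query):
    """Return (code point, name) pairs whose name contains every query word.

    Brute-force scan over all code points; a cross-check only, not an index.
    """
    wanted = split_words(query)
    if not wanted:
        return []
    hits = []
    for cp in range(sys.maxunicode + 1):
        # Unnamed code points (unassigned, surrogates, controls) yield "".
        name = unicodedata.name(chr(cp), "")
        if name and wanted <= split_words(name):
            hits.append((cp, name))
    return hits


if __name__ == "__main__":
    for cp, name in search_names("SMALL TRIANGLE"):
        print(f"U+{cp:04X} {chr(cp)} {name}")
```

Under that matching rule, the result for `"SMALL TRIANGLE"` includes U+25BE and U+25BF from the first listing, so an empty result for that query looks like a bug in the lookup rather than a gap in the data.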

Jayman2000, Jul 09 '22 17:07

Huh, I suspect that the way we're querying the fst is wrong. Maybe there's a better way to make these queries, but I'm not sure at the moment.
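
One query strategy that handles multi-word searches is an inverted index from each word of a name to the code points that use it, intersected across the query words. The Python sketch below only illustrates that idea under stated assumptions: it does not use an fst, it is not the chars implementation, and `NAMES`, `build_index`, and `lookup` are hypothetical names. The sample names are copied from the outputs in this thread.

```python
from collections import defaultdict

# A tiny sample of names taken from the outputs in this thread.
NAMES = {
    0x25BE: "BLACK DOWN-POINTING SMALL TRIANGLE",
    0x25BF: "WHITE DOWN-POINTING SMALL TRIANGLE",
    0x1F53D: "DOWN-POINTING SMALL RED TRIANGLE",
    0x23F7: "BLACK MEDIUM DOWN-POINTING TRIANGLE",
    0x1F918: "SIGN OF THE HORNS",
    0x1F608: "SMILING FACE WITH HORNS",
}


def words_of(text):
    """Upper-case the text and split on whitespace and hyphens (an assumption)."""
    return text.upper().replace("-", " ").split()


def build_index(names):
    """Map each word to the set of code points whose name contains it."""
    index = defaultdict(set)
    for cp, name in names.items():
        for word in words_of(name):
            index[word].add(cp)
    return index


def lookup(index, query):
    """Return the code points whose names contain every word of the query."""
    words = words_of(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


INDEX = build_index(NAMES)

if __name__ == "__main__":
    for cp in sorted(lookup(INDEX, "SMALL TRIANGLE")):
        print(f"U+{cp:04X} {NAMES[cp]}")
```

Because each word is looked up on its own and the sets are intersected, the query words do not need to be adjacent in the name or appear at its start.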

antifuchs, Jul 10 '22 03:07

Something similar is happening when I search for “SIGN”:

$ chars HORNS
U+0001F918, 🤘 0x0001F918, \0374430, UTF-8: f0 9f a4 98, UTF-16BE: d83edd18
Width: 2, prints as 🤘
Quotes as \u{1f918}
Unicode name: SIGN OF THE HORNS

U+0001F608, 😈 0x0001F608, \0373010, UTF-8: f0 9f 98 88, UTF-16BE: d83dde08
Width: 2, prints as 😈
Quotes as \u{1f608}
Unicode name: SMILING FACE WITH HORNS

$ chars SIGN
$ 

Jayman2000, Aug 19 '22 20:08