go-runewidth

Define width?

Open • ghostsquad opened this issue 4 years ago • 7 comments

This is a question about how you are defining "width". I'm mostly looking for a solution that gives me character width in monospaced fonts. For example, in #39 and #36, I'd expect the "width" of a flag to still be 2: although it is considered 1 character in modern renderers, it still takes up the space of 2 normal characters.

ghostsquad, Jun 29 '20 04:06

@ghostsquad rune has a clear definition in the Go specification: an integer value identifying a Unicode code point.

The doc for RuneWidth gives another hint: it points to https://www.unicode.org/reports/tr11/ which talks about cells.

Flag emojis, by contrast, are made of 2 runes/codepoints.

So this package is more about East Asian characters, not emojis.
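
To make the "2 runes" point concrete, here is a minimal sketch that uses only the standard library. `codePoints` is a hypothetical helper name, not part of go-runewidth. It lists the code points in a string, so a flag shows up as two regional indicator symbols.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// codePoints returns the code points of s in U+XXXX notation, one per rune.
// Hypothetical helper for illustration only.
func codePoints(s string) []string {
	out := make([]string, 0, utf8.RuneCountInString(s))
	for _, r := range s {
		out = append(out, fmt.Sprintf("%U", r))
	}
	return out
}

func main() {
	flag := "\U0001F1E7\U0001F1FE" // 🇧🇾: regional indicators B and Y
	// Code points, rune count, and byte length of the flag.
	fmt.Println(codePoints(flag), utf8.RuneCountInString(flag), len(flag))
	// prints: [U+1F1E7 U+1F1FE] 2 8
}
```

So any per-rune width function sees two separate code points here; whether a terminal draws them as one glyph is a rendering decision.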

dolmen, Jul 08 '20 15:07

@ghostsquad uniseg.GraphemeClusterCount might interest you: it will tell you how multiple runes combine into a single grapheme. But that's not a complete solution to your problem (I suppose rendering in a terminal emulator): it will not tell you how much space is used to render that grapheme in a monospace font (especially as "monospace font" and "modern renders" are fuzzy).

dolmen, Jul 11 '20 06:07

@dolmen There is already a plan to use it.

See https://github.com/mattn/go-runewidth/pull/29

mattn, Jul 11 '20 06:07

@dolmen yep I already looked at uniseg, and it doesn't provide the right information

ghostsquad, Jul 11 '20 18:07

You can kinda see some of the problems I'm trying to solve... it seems not all monospaced fonts are created equal. In the GitHub code view, the right padding misaligns the text. But in the screenshot (of my terminal, using Fira Mono for Powerline), the right padding is needed.

❯ ./test
     rune width: 2
     rune count: 1
            len: 4
    grapheme ct: 1
   req left pad: 3
  req right pad: 0
[  🔄 AAA]
     rune width: 2
     rune count: 2
            len: 8
    grapheme ct: 1
   req left pad: 4
  req right pad: 1
[  🇧🇾  BBB]
     rune width: 2
     rune count: 2
            len: 6
    grapheme ct: 1
   req left pad: 4
  req right pad: 1
[  ℹ️  CCC]
     rune width: 1
     rune count: 1
            len: 3
    grapheme ct: 1
   req left pad: 4
  req right pad: 0
[   • DDD]

[  🔄 AAA]
[  🇧🇾  BBB]
[  ℹ️  CCC]
[   • DDD]

[image: screenshot of the terminal output]

package main

import (
	"fmt"
	"unicode/utf8"

	"github.com/mattn/go-runewidth"
	"github.com/rivo/uniseg"
)

func main() {
	fmt.Printf("%15s: %d\n", "rune width", runewidth.StringWidth("🔄"))
	fmt.Printf("%15s: %d\n", "rune count", utf8.RuneCountInString("🔄"))
	fmt.Printf("%15s: %d\n", "len", len("🔄"))
	fmt.Printf("%15s: %d\n", "grapheme ct", uniseg.GraphemeClusterCount("🔄"))
	fmt.Printf("%15s: %d\n", "req left pad", 3)
	fmt.Printf("%15s: %d\n", "req right pad", 0)
	fmt.Printf("[%*s", 3, "🔄")
	fmt.Printf(" AAA]\n")

	fmt.Printf("%15s: %d\n", "rune width", runewidth.StringWidth("🇧🇾"))
	fmt.Printf("%15s: %d\n", "rune count", utf8.RuneCountInString("🇧🇾"))
	fmt.Printf("%15s: %d\n", "len", len("🇧🇾"))
	fmt.Printf("%15s: %d\n", "grapheme ct", uniseg.GraphemeClusterCount("🇧🇾"))
	fmt.Printf("%15s: %d\n", "req left pad", 4)
	fmt.Printf("%15s: %d\n", "req right pad", 1)
	fmt.Printf("[%*s", 4, "🇧🇾")
	fmt.Printf("  BBB]\n")

	fmt.Printf("%15s: %d\n", "rune width", runewidth.StringWidth("ℹ️"))
	fmt.Printf("%15s: %d\n", "rune count", utf8.RuneCountInString("ℹ️"))
	fmt.Printf("%15s: %d\n", "len", len("ℹ️"))
	fmt.Printf("%15s: %d\n", "grapheme ct", uniseg.GraphemeClusterCount("ℹ️"))
	fmt.Printf("%15s: %d\n", "req left pad", 4)
	fmt.Printf("%15s: %d\n", "req right pad", 1)
	fmt.Printf("[%*s", 4, "ℹ️")
	fmt.Printf("  CCC]\n")

	fmt.Printf("%15s: %d\n", "rune width", runewidth.StringWidth("•"))
	fmt.Printf("%15s: %d\n", "rune count", utf8.RuneCountInString("•"))
	fmt.Printf("%15s: %d\n", "len", len("•"))
	fmt.Printf("%15s: %d\n", "grapheme ct", uniseg.GraphemeClusterCount("•"))
	fmt.Printf("%15s: %d\n", "req left pad", 4)
	fmt.Printf("%15s: %d\n", "req right pad", 0)
	fmt.Printf("[%*s", 4, "•")
	fmt.Printf(" DDD]\n")

	fmt.Println()

	fmt.Printf("[%*s AAA]\n", 3, "🔄")
	fmt.Printf("[%*s  BBB]\n", 4, "🇧🇾")
	fmt.Printf("[%*s  CCC]\n", 4, "ℹ️")
	fmt.Printf("[%*s DDD]\n", 4, "•")
}
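
One likely reason the required left pad differs between rows: the fmt documentation states that width is measured in runes, not terminal cells, so `%*s` pads a two-rune flag less than a one-rune emoji. The sketch below illustrates that arithmetic with two hypothetical helpers, `padLeftCells` and `fmtWidthFor`. The cell widths are assumptions passed in by the caller (for example, values from runewidth.StringWidth); nothing here measures what a terminal actually draws.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// padLeftCells right-aligns s in a field of `field` terminal cells, given the
// number of cells the caller assumes s occupies. Hypothetical helper.
func padLeftCells(s string, cells, field int) string {
	if cells >= field {
		return s
	}
	return strings.Repeat(" ", field-cells) + s
}

// fmtWidthFor returns the width to pass to %*s for the same alignment,
// compensating for fmt counting runes rather than cells. Hypothetical helper.
func fmtWidthFor(s string, cells, field int) int {
	return field - cells + utf8.RuneCountInString(s)
}

func main() {
	// Assumed cell widths: 2 for the emoji and the flag, 1 for the bullet.
	samples := []struct {
		s     string
		cells int
	}{
		{"🔄", 2},
		{"🇧🇾", 2},
		{"•", 1},
	}
	for _, c := range samples {
		w := fmtWidthFor(c.s, c.cells, 4)
		fmt.Printf("%%*s width %d: [%*s] [%s]\n", w, w, c.s, padLeftCells(c.s, c.cells, 4))
	}
}
```

Under these assumptions the computed widths are 3, 4, and 4, matching the "req left pad" values above. This does not account for the right padding, which appears to depend on how the terminal and font render the sequence.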

ghostsquad, Aug 11 '20 06:08

well, I might have landed on something interesting...

package main

import (
	"fmt"
	"strings"
	// "unicode/utf8"

	"github.com/mattn/go-runewidth"
)

// aligns to 5 characters
func valuePaddingPredictor(val string) string {
	runeWidth := runewidth.StringWidth(val)
	// runeCount := utf8.RuneCountInString(val)
	stringLen := len(val)

	leftPad := 3
	rightPad := 1
	if runeWidth == 1 {
		leftPad++
	}

	if stringLen > 4 {
		leftPad++
		rightPad++
	}

	return fmt.Sprintf("[%*s%sAAA]", leftPad, val, strings.Repeat(" ", rightPad))
}

func main() {
	characters := []string{
		"🔄",
		"🇧🇾",
		"ℹ️",
		"💩",
		"x",
		"😀",
		"💚",
		"☁️",
		"•",
		"⨯",
		"✔️",
		"✓",
		"؏",
		"├",
		"⻨",
	}

	for _, c := range characters {
		fmt.Println(valuePaddingPredictor(c))
	}
}
[  🔄 AAA]
[  🇧🇾  AAA]
[  ℹ️  AAA]
[  💩 AAA]
[   x AAA]
[  😀 AAA]
[  💚 AAA]
[  ☁️  AAA]
[   • AAA]
[   ⨯ AAA]
[  ✔️  AAA]
[   ✓ AAA]
[   ؏ AAA]
[   ├ AAA]
[  ⻨ AAA]

[image]

this is probably good enough for what I need.
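
A note on the `stringLen > 4` check, as a hedged observation rather than a fix: a single UTF-8 encoded rune is at most 4 bytes, so a byte length above 4 can only come from a multi-rune string. The reverse does not hold (two ASCII letters are two runes in 2 bytes). The sketch below compares both tests on a few of the samples; `isMultiRune` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isMultiRune reports whether s consists of more than one code point.
// Hypothetical helper for illustration only.
func isMultiRune(s string) bool {
	return utf8.RuneCountInString(s) > 1
}

func main() {
	samples := []string{
		"🔄",
		"🇧🇾",
		"\u2139\ufe0f", // ℹ️: INFORMATION SOURCE + VARIATION SELECTOR-16
		"•",
		"x",
		"ab", // two runes, but only 2 bytes
	}
	for _, s := range samples {
		fmt.Printf("%s len>4=%t multiRune=%t\n", s, len(s) > 4, isMultiRune(s))
	}
}
```

For the emoji samples the two checks agree; they diverge only for short multi-rune strings such as "ab".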

ghostsquad, Aug 11 '20 07:08

Hello,

I maintain the Python wcwidth library, and I recently wrote a specification that is of interest to this specific issue. I also wrote an automatic testing tool to assess any individual terminal emulator's compliance with the specification for Wide, Zero, ZWJ, and Emoji VS-16 character sequences.

I wrote an overview here https://www.jeffquast.com/post/ucs-detect-test-results/

I just want to point out, most especially, the automatic test results for 20+ terminals: you will find varying levels of Unicode version and feature support across terminals, so it is important to keep that in mind when trying to validate.
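
As a rough illustration of two of those sequence kinds, the sketch below flags strings that contain a zero width joiner (U+200D) or variation selector-16 (U+FE0F). `specialCodePoints` is a hypothetical helper, unrelated to the specification or the testing tool mentioned above; it only marks strings whose rendered width is more likely to vary between terminals.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	zwj  = '\u200d' // ZERO WIDTH JOINER
	vs16 = '\ufe0f' // VARIATION SELECTOR-16
)

// specialCodePoints lists which of the two special code points appear in s.
// Hypothetical helper for illustration only.
func specialCodePoints(s string) []string {
	var tags []string
	if strings.ContainsRune(s, zwj) {
		tags = append(tags, "ZWJ")
	}
	if strings.ContainsRune(s, vs16) {
		tags = append(tags, "VS-16")
	}
	return tags
}

func main() {
	samples := []string{
		"\u2139\ufe0f",                 // ℹ️
		"\U0001F468\u200d\U0001F469", // man + ZWJ + woman
		"x",
	}
	for _, s := range samples {
		fmt.Printf("%s %v\n", s, specialCodePoints(s))
	}
}
```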

jquast, Dec 17 '23 16:12