
sdlmame using more than one font to increase the coverage?

Open belegdol opened this issue 3 years ago • 7 comments

Hello, sdlmame defaults to the Liberation Sans font, which, among other things, lacks some of the symbols needed for vgmplay. Inspired by Fedora 36 switching to Google Noto fonts, I checked whether Google Noto Sans could provide the needed symbols. It does, but the symbols are part of Noto Sans Symbols2:

$ fc-list ':charset=25a0 25eb 25b6 25c4 25ce' | grep noto
/usr/share/fonts/google-noto-vf/NotoSansMono-VF.ttf: Noto Sans Mono
/usr/share/fonts/google-noto/NotoSansSymbols2-Regular.ttf: Noto Sans Symbols2:style=Regular
/usr/share/fonts/google-noto-vf/NotoSansMono-VF.ttf: Noto Sans Mono:style=SemiBold
/usr/share/fonts/google-noto-vf/NotoSansMono-VF.ttf: Noto Sans Mono:style=Regular
/usr/share/fonts/google-noto-vf/NotoSansMono-VF.ttf: Noto Sans Mono:style=Medium
/usr/share/fonts/google-noto-vf/NotoSansMono-VF.ttf: Noto Sans Mono:style=Bold
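
For reference, the five code points passed to `fc-list` above all sit in the Geometric Shapes block (U+25A0-U+25FF). A small Python sketch, standard library only and not part of MAME or fontconfig, that lists each one with its Unicode name:

```python
import unicodedata

# Code points taken from the fc-list query above.
NEEDED = [0x25A0, 0x25EB, 0x25B6, 0x25C4, 0x25CE]


def describe(codepoints):
    """Return (hex code, character, Unicode name) for each code point."""
    return [
        (f"U+{cp:04X}", chr(cp), unicodedata.name(chr(cp)))
        for cp in codepoints
    ]


for code, char, name in describe(NEEDED):
    print(code, char, name)
```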

Would it be possible to expand the font code so that it combines Noto Sans and Noto Sans Symbols2? The other option would be to switch to the DejaVu font, but as distros are moving away from it, that does not seem like the best choice.

belegdol, May 08 '22 07:05

Doing so would require quite a bit of work. The lack of a real font server is an issue on Linux. It pushes stuff like font substitution for missing characters into applications, and drives up memory consumption and startup time because every application needs logic for choosing appropriate fonts for fallback.
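
To make the fallback logic mentioned above concrete, here is a minimal Python sketch of per-character font substitution. It is an illustration only, not MAME's OSD font code: the font names and coverage sets are hypothetical, and it performs no shaping, so it would not by itself make scripts like Thai or Khmer render correctly.

```python
from itertools import groupby


def pick_font(ch, fonts):
    """Return the name of the first font whose coverage includes ch, else None."""
    cp = ord(ch)
    for name, coverage in fonts:
        if cp in coverage:
            return name
    return None


def split_into_runs(text, fonts):
    """Group consecutive characters that resolve to the same font."""
    return [
        (font, "".join(chars))
        for font, chars in groupby(text, key=lambda ch: pick_font(ch, fonts))
    ]


# Hypothetical coverage sets for illustration; a real implementation would
# read them from the installed font files instead of hard-coding them.
FONTS = [
    ("Text Font", set(range(0x20, 0x7F))),
    ("Symbol Font", {0x25A0, 0x25B6, 0x25C4, 0x25CE, 0x25EB}),
]

print(split_into_runs("Play \u25b6 Stop \u25a0", FONTS))
```

Fonts earlier in the list win, so the primary text font stays in use wherever it has a glyph and the symbol font only fills the gaps. Characters that no font covers end up in a run tagged `None`.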

That aside, the whole OSD font module interface needs reworking to render text runs. It’s currently completely impossible for languages like Thai or Khmer (Cambodian) to render in MAME at all.

cuavas, May 08 '22 16:05

Thanks for explaining. I don't suppose http://sdlpango.sourceforge.net/ is of any help? It seems there have been some attempts to port it to SDL2: https://sourceforge.net/p/sdlpango/bugs/7/.

belegdol, May 08 '22 17:05

While I do not know what is preventing Thai or Khmer text from being rendered, it appears that there has been some work in SDL2 regarding rendering ligatures and substitutions: https://github.com/libsdl-org/SDL_ttf/issues/62

belegdol, May 09 '22 17:05

Given that a comprehensive solution is not on the horizon, would switching to DejaVu Sans be a possibility? I tried a bunch of fonts installed on my system, and it was the only one apart from Adobe Source Code Pro that had all the symbols needed for vgmplay. Source Code Pro is shipped as OpenType on Fedora, though, so it does not work as a simple substitution.

belegdol, May 24 '24 20:05

How widespread is DejaVu Sans in Linux distributions, and what’s its coverage like for Latin script languages? Various things in MAME no longer use plain ASCII for the main description.

FWIW, the best way to address issues with vgmplay specifically would probably be to embed SVG in the layout, now that we support that.

cuavas, May 25 '24 19:05

Re: coverage, see below. Coverage of DejaVu Sans seems to be generally better. I used https://github.com/abelcheung/font-coverage to generate the data.


=== Liberation Sans

Basic Latin (U+0020-U+007F) => 95 / 95 / 0
Latin-1 Supplement (U+00A0-U+00FF) => 96 / 96 / 0
Latin Extended-A (U+0100-U+017F) => 128 / 128 / 0
Latin Extended-B (U+0180-U+024F) => 208 / 208 / 0
IPA Extensions (U+0250-U+02AF) => 96 / 96 / 0
Spacing Modifier Letters (U+02B0-U+02FF) => 80 / 80 / 0
Combining Diacritical Marks (U+0300-U+036F) => 112 / 112 / 0
Greek and Coptic (U+0370-U+03FF) => 135 / 127 / 0
Cyrillic (U+0400-U+04FF) => 256 / 256 / 0
Cyrillic Supplement (U+0500-U+052F) => 48 / 24 / 0
Hebrew (U+0590-U+05FF) => 88 / 87 / 0
Phonetic Extensions (U+1D00-U+1D7F) => 128 / 128 / 0
Phonetic Extensions Supplement (U+1D80-U+1DBF) => 64 / 64 / 0
Combining Diacritical Marks Supplement (U+1DC0-U+1DFF) => 64 / 13 / 0
Latin Extended Additional (U+1E00-U+1EFF) => 256 / 247 / 0
Greek Extended (U+1F00-U+1FFF) => 233 / 233 / 0
General Punctuation (U+2000-U+206F) => 111 / 57 / 0
Superscripts and Subscripts (U+2070-U+209F) => 42 / 22 / 0
Currency Symbols (U+20A0-U+20CF) => 33 / 23 / 0
Combining Diacritical Marks for Symbols (U+20D0-U+20FF) => 33 / 1 / 0
Letterlike Symbols (U+2100-U+214F) => 80 / 9 / 0
Number Forms (U+2150-U+218F) => 60 / 7 / 0
Arrows (U+2190-U+21FF) => 112 / 8 / 0
Mathematical Operators (U+2200-U+22FF) => 256 / 18 / 0
Miscellaneous Technical (U+2300-U+23FF) => 256 / 4 / 0
Box Drawing (U+2500-U+257F) => 128 / 40 / 0
Block Elements (U+2580-U+259F) => 32 / 8 / 0
Geometric Shapes (U+25A0-U+25FF) => 96 / 24 / 0
Miscellaneous Symbols (U+2600-U+26FF) => 256 / 21 / 0
Latin Extended-C (U+2C60-U+2C7F) => 32 / 21 / 0
Supplemental Punctuation (U+2E00-U+2E7F) => 94 / 1 / 0
Modifier Tone Letters (U+A700-U+A71F) => 32 / 9 / 0
Latin Extended-D (U+A720-U+A7FF) => 193 / 7 / 0
Alphabetic Presentation Forms (U+FB00-U+FB4F) => 58 / 48 / 0
Combining Half Marks (U+FE20-U+FE2F) => 16 / 4 / 0
Specials (U+FFF0-U+FFFF) => 5 / 1 / 0
Unicode coverage = 2327 / 149476 = 1.56

=== DejaVu Sans

Basic Latin (U+0020-U+007F) => 95 / 95 / 0
Latin-1 Supplement (U+00A0-U+00FF) => 96 / 96 / 0
Latin Extended-A (U+0100-U+017F) => 128 / 128 / 0
Latin Extended-B (U+0180-U+024F) => 208 / 208 / 0
IPA Extensions (U+0250-U+02AF) => 96 / 96 / 0
Spacing Modifier Letters (U+02B0-U+02FF) => 80 / 63 / 0
Combining Diacritical Marks (U+0300-U+036F) => 112 / 93 / 0
Greek and Coptic (U+0370-U+03FF) => 135 / 135 / 0
Cyrillic (U+0400-U+04FF) => 256 / 256 / 0
Cyrillic Supplement (U+0500-U+052F) => 48 / 38 / 0
Armenian (U+0530-U+058F) => 91 / 86 / 0
Hebrew (U+0590-U+05FF) => 88 / 54 / 0
Arabic (U+0600-U+06FF) => 256 / 165 / 0
NKo (U+07C0-U+07FF) => 62 / 54 / 0
Thai (U+0E00-U+0E7F) => 87 / 1 / 0
Lao (U+0E80-U+0EFF) => 83 / 65 / 0
Georgian (U+10A0-U+10FF) => 88 / 83 / 0
Unified Canadian Aboriginal Syllabics (U+1400-U+167F) => 640 / 404 / 0
Ogham (U+1680-U+169F) => 29 / 29 / 0
Phonetic Extensions (U+1D00-U+1D7F) => 128 / 106 / 0
Phonetic Extensions Supplement (U+1D80-U+1DBF) => 64 / 38 / 0
Combining Diacritical Marks Supplement (U+1DC0-U+1DFF) => 64 / 6 / 0
Latin Extended Additional (U+1E00-U+1EFF) => 256 / 252 / 0
Greek Extended (U+1F00-U+1FFF) => 233 / 233 / 0
General Punctuation (U+2000-U+206F) => 111 / 107 / 0
Superscripts and Subscripts (U+2070-U+209F) => 42 / 42 / 0
Currency Symbols (U+20A0-U+20CF) => 33 / 26 / 0
Combining Diacritical Marks for Symbols (U+20D0-U+20FF) => 33 / 7 / 0
Letterlike Symbols (U+2100-U+214F) => 80 / 75 / 0
Number Forms (U+2150-U+218F) => 60 / 55 / 0
Arrows (U+2190-U+21FF) => 112 / 112 / 0
Mathematical Operators (U+2200-U+22FF) => 256 / 256 / 0
Miscellaneous Technical (U+2300-U+23FF) => 256 / 65 / 0
Control Pictures (U+2400-U+243F) => 39 / 2 / 0
Enclosed Alphanumerics (U+2460-U+24FF) => 160 / 10 / 0
Box Drawing (U+2500-U+257F) => 128 / 128 / 0
Block Elements (U+2580-U+259F) => 32 / 32 / 0
Geometric Shapes (U+25A0-U+25FF) => 96 / 96 / 0
Miscellaneous Symbols (U+2600-U+26FF) => 256 / 189 / 0
Dingbats (U+2700-U+27BF) => 192 / 174 / 0
Miscellaneous Mathematical Symbols-A (U+27C0-U+27EF) => 48 / 9 / 0
Supplemental Arrows-A (U+27F0-U+27FF) => 16 / 16 / 0
Braille Patterns (U+2800-U+28FF) => 256 / 256 / 0
Supplemental Arrows-B (U+2900-U+297F) => 128 / 6 / 0
Miscellaneous Mathematical Symbols-B (U+2980-U+29FF) => 128 / 13 / 0
Supplemental Mathematical Operators (U+2A00-U+2AFF) => 256 / 74 / 0
Miscellaneous Symbols and Arrows (U+2B00-U+2BFF) => 253 / 35 / 0
Latin Extended-C (U+2C60-U+2C7F) => 32 / 31 / 0
Georgian Supplement (U+2D00-U+2D2F) => 40 / 38 / 0
Tifinagh (U+2D30-U+2D7F) => 59 / 55 / 0
Supplemental Punctuation (U+2E00-U+2E7F) => 94 / 7 / 0
Yijing Hexagram Symbols (U+4DC0-U+4DFF) => 64 / 64 / 0
Lisu (U+A4D0-U+A4FF) => 48 / 48 / 0
Cyrillic Extended-B (U+A640-U+A69F) => 96 / 33 / 0
Modifier Tone Letters (U+A700-U+A71F) => 32 / 20 / 0
Latin Extended-D (U+A720-U+A7FF) => 193 / 77 / 0
Private Use Area (U+E000-U+F8FF) => 0 / 0 / 96
Alphabetic Presentation Forms (U+FB00-U+FB4F) => 58 / 58 / 0
Arabic Presentation Forms-A (U+FB50-U+FDFF) => 631 / 108 / 0
Variation Selectors (U+FE00-U+FE0F) => 16 / 16 / 0
Combining Half Marks (U+FE20-U+FE2F) => 16 / 4 / 0
Arabic Presentation Forms-B (U+FE70-U+FEFF) => 141 / 141 / 0
Specials (U+FFF0-U+FFFF) => 5 / 5 / 0
Old Italic (U+10300-U+1032F) => 39 / 35 / 0
Tai Xuan Jing Symbols (U+1D300-U+1D35F) => 87 / 87 / 0
Mathematical Alphanumeric Symbols (U+1D400-U+1D7FF) => 996 / 117 / 0
Arabic Mathematical Alphabetic Symbols (U+1EE00-U+1EEFF) => 143 / 74 / 0
Domino Tiles (U+1F030-U+1F09F) => 100 / 100 / 0
Playing Cards (U+1F0A0-U+1F0FF) => 82 / 59 / 0
Miscellaneous Symbols and Pictographs (U+1F300-U+1F5FF) => 768 / 12 / 0
Emoticons (U+1F600-U+1F64F) => 80 / 64 / 0
Unicode coverage = 5822 / 149476 = 3.89

$ diff -u liberation.txt dejavu.txt
--- liberation.txt	2024-05-26 13:25:48.184324659 +0200
+++ dejavu.txt	2024-05-26 13:25:26.328369116 +0200
@@ -1,40 +1,75 @@
 
-=== Liberation Sans
+=== DejaVu Sans
 
 Basic Latin (U+0020-U+007F) => 95 / 95 / 0
 Latin-1 Supplement (U+00A0-U+00FF) => 96 / 96 / 0
 Latin Extended-A (U+0100-U+017F) => 128 / 128 / 0
 Latin Extended-B (U+0180-U+024F) => 208 / 208 / 0
 IPA Extensions (U+0250-U+02AF) => 96 / 96 / 0
-Spacing Modifier Letters (U+02B0-U+02FF) => 80 / 80 / 0
-Combining Diacritical Marks (U+0300-U+036F) => 112 / 112 / 0
-Greek and Coptic (U+0370-U+03FF) => 135 / 127 / 0
+Spacing Modifier Letters (U+02B0-U+02FF) => 80 / 63 / 0
+Combining Diacritical Marks (U+0300-U+036F) => 112 / 93 / 0
+Greek and Coptic (U+0370-U+03FF) => 135 / 135 / 0
 Cyrillic (U+0400-U+04FF) => 256 / 256 / 0
-Cyrillic Supplement (U+0500-U+052F) => 48 / 24 / 0
-Hebrew (U+0590-U+05FF) => 88 / 87 / 0
-Phonetic Extensions (U+1D00-U+1D7F) => 128 / 128 / 0
-Phonetic Extensions Supplement (U+1D80-U+1DBF) => 64 / 64 / 0
-Combining Diacritical Marks Supplement (U+1DC0-U+1DFF) => 64 / 13 / 0
-Latin Extended Additional (U+1E00-U+1EFF) => 256 / 247 / 0
+Cyrillic Supplement (U+0500-U+052F) => 48 / 38 / 0
+Armenian (U+0530-U+058F) => 91 / 86 / 0
+Hebrew (U+0590-U+05FF) => 88 / 54 / 0
+Arabic (U+0600-U+06FF) => 256 / 165 / 0
+NKo (U+07C0-U+07FF) => 62 / 54 / 0
+Thai (U+0E00-U+0E7F) => 87 / 1 / 0
+Lao (U+0E80-U+0EFF) => 83 / 65 / 0
+Georgian (U+10A0-U+10FF) => 88 / 83 / 0
+Unified Canadian Aboriginal Syllabics (U+1400-U+167F) => 640 / 404 / 0
+Ogham (U+1680-U+169F) => 29 / 29 / 0
+Phonetic Extensions (U+1D00-U+1D7F) => 128 / 106 / 0
+Phonetic Extensions Supplement (U+1D80-U+1DBF) => 64 / 38 / 0
+Combining Diacritical Marks Supplement (U+1DC0-U+1DFF) => 64 / 6 / 0
+Latin Extended Additional (U+1E00-U+1EFF) => 256 / 252 / 0
 Greek Extended (U+1F00-U+1FFF) => 233 / 233 / 0
-General Punctuation (U+2000-U+206F) => 111 / 57 / 0
-Superscripts and Subscripts (U+2070-U+209F) => 42 / 22 / 0
-Currency Symbols (U+20A0-U+20CF) => 33 / 23 / 0
-Combining Diacritical Marks for Symbols (U+20D0-U+20FF) => 33 / 1 / 0
-Letterlike Symbols (U+2100-U+214F) => 80 / 9 / 0
-Number Forms (U+2150-U+218F) => 60 / 7 / 0
-Arrows (U+2190-U+21FF) => 112 / 8 / 0
-Mathematical Operators (U+2200-U+22FF) => 256 / 18 / 0
-Miscellaneous Technical (U+2300-U+23FF) => 256 / 4 / 0
-Box Drawing (U+2500-U+257F) => 128 / 40 / 0
-Block Elements (U+2580-U+259F) => 32 / 8 / 0
-Geometric Shapes (U+25A0-U+25FF) => 96 / 24 / 0
-Miscellaneous Symbols (U+2600-U+26FF) => 256 / 21 / 0
-Latin Extended-C (U+2C60-U+2C7F) => 32 / 21 / 0
-Supplemental Punctuation (U+2E00-U+2E7F) => 94 / 1 / 0
-Modifier Tone Letters (U+A700-U+A71F) => 32 / 9 / 0
-Latin Extended-D (U+A720-U+A7FF) => 193 / 7 / 0
-Alphabetic Presentation Forms (U+FB00-U+FB4F) => 58 / 48 / 0
+General Punctuation (U+2000-U+206F) => 111 / 107 / 0
+Superscripts and Subscripts (U+2070-U+209F) => 42 / 42 / 0
+Currency Symbols (U+20A0-U+20CF) => 33 / 26 / 0
+Combining Diacritical Marks for Symbols (U+20D0-U+20FF) => 33 / 7 / 0
+Letterlike Symbols (U+2100-U+214F) => 80 / 75 / 0
+Number Forms (U+2150-U+218F) => 60 / 55 / 0
+Arrows (U+2190-U+21FF) => 112 / 112 / 0
+Mathematical Operators (U+2200-U+22FF) => 256 / 256 / 0
+Miscellaneous Technical (U+2300-U+23FF) => 256 / 65 / 0
+Control Pictures (U+2400-U+243F) => 39 / 2 / 0
+Enclosed Alphanumerics (U+2460-U+24FF) => 160 / 10 / 0
+Box Drawing (U+2500-U+257F) => 128 / 128 / 0
+Block Elements (U+2580-U+259F) => 32 / 32 / 0
+Geometric Shapes (U+25A0-U+25FF) => 96 / 96 / 0
+Miscellaneous Symbols (U+2600-U+26FF) => 256 / 189 / 0
+Dingbats (U+2700-U+27BF) => 192 / 174 / 0
+Miscellaneous Mathematical Symbols-A (U+27C0-U+27EF) => 48 / 9 / 0
+Supplemental Arrows-A (U+27F0-U+27FF) => 16 / 16 / 0
+Braille Patterns (U+2800-U+28FF) => 256 / 256 / 0
+Supplemental Arrows-B (U+2900-U+297F) => 128 / 6 / 0
+Miscellaneous Mathematical Symbols-B (U+2980-U+29FF) => 128 / 13 / 0
+Supplemental Mathematical Operators (U+2A00-U+2AFF) => 256 / 74 / 0
+Miscellaneous Symbols and Arrows (U+2B00-U+2BFF) => 253 / 35 / 0
+Latin Extended-C (U+2C60-U+2C7F) => 32 / 31 / 0
+Georgian Supplement (U+2D00-U+2D2F) => 40 / 38 / 0
+Tifinagh (U+2D30-U+2D7F) => 59 / 55 / 0
+Supplemental Punctuation (U+2E00-U+2E7F) => 94 / 7 / 0
+Yijing Hexagram Symbols (U+4DC0-U+4DFF) => 64 / 64 / 0
+Lisu (U+A4D0-U+A4FF) => 48 / 48 / 0
+Cyrillic Extended-B (U+A640-U+A69F) => 96 / 33 / 0
+Modifier Tone Letters (U+A700-U+A71F) => 32 / 20 / 0
+Latin Extended-D (U+A720-U+A7FF) => 193 / 77 / 0
+Private Use Area (U+E000-U+F8FF) => 0 / 0 / 96
+Alphabetic Presentation Forms (U+FB00-U+FB4F) => 58 / 58 / 0
+Arabic Presentation Forms-A (U+FB50-U+FDFF) => 631 / 108 / 0
+Variation Selectors (U+FE00-U+FE0F) => 16 / 16 / 0
 Combining Half Marks (U+FE20-U+FE2F) => 16 / 4 / 0
-Specials (U+FFF0-U+FFFF) => 5 / 1 / 0
-Unicode coverage = 2327 / 149476 = 1.56
+Arabic Presentation Forms-B (U+FE70-U+FEFF) => 141 / 141 / 0
+Specials (U+FFF0-U+FFFF) => 5 / 5 / 0
+Old Italic (U+10300-U+1032F) => 39 / 35 / 0
+Tai Xuan Jing Symbols (U+1D300-U+1D35F) => 87 / 87 / 0
+Mathematical Alphanumeric Symbols (U+1D400-U+1D7FF) => 996 / 117 / 0
+Arabic Mathematical Alphabetic Symbols (U+1EE00-U+1EEFF) => 143 / 74 / 0
+Domino Tiles (U+1F030-U+1F09F) => 100 / 100 / 0
+Playing Cards (U+1F0A0-U+1F0FF) => 82 / 59 / 0
+Miscellaneous Symbols and Pictographs (U+1F300-U+1F5FF) => 768 / 12 / 0
+Emoticons (U+1F600-U+1F64F) => 80 / 64 / 0
+Unicode coverage = 5822 / 149476 = 3.89
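
The overall totals favour DejaVu Sans, but a few blocks go the other way. The Python sketch below shows one way to pull those out of the two reports and to reproduce the summary percentages. It assumes each block line reads `total / covered / other`; that reading is inferred from the data (for Liberation Sans the second column sums to the reported 2327) rather than taken from the tool's documentation. The function names are illustrative, and only a few lines from the reports are included.

```python
import re

# Matches lines such as "Arrows (U+2190-U+21FF) => 112 / 8 / 0".
LINE_RE = re.compile(
    r"^(?P<block>.+?) \(U\+[0-9A-F]+-U\+[0-9A-F]+\) => "
    r"(?P<total>\d+) / (?P<covered>\d+) / \d+$"
)


def parse_report(text):
    """Map block name -> (total, covered) for every block line in a report."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            result[m["block"]] = (int(m["total"]), int(m["covered"]))
    return result


def regressions(old, new):
    """Blocks present in both reports where `new` covers fewer code points."""
    return {
        block: (old[block][1], new[block][1])
        for block in old
        if block in new and new[block][1] < old[block][1]
    }


def percent(covered, total):
    """Coverage as a percentage, rounded the way the summary line shows it."""
    return round(100 * covered / total, 2)


# A few lines copied from the two reports above.
LIBERATION = """\
Spacing Modifier Letters (U+02B0-U+02FF) => 80 / 80 / 0
Hebrew (U+0590-U+05FF) => 88 / 87 / 0
Geometric Shapes (U+25A0-U+25FF) => 96 / 24 / 0
"""
DEJAVU = """\
Spacing Modifier Letters (U+02B0-U+02FF) => 80 / 63 / 0
Hebrew (U+0590-U+05FF) => 88 / 54 / 0
Geometric Shapes (U+25A0-U+25FF) => 96 / 96 / 0
"""

print(regressions(parse_report(LIBERATION), parse_report(DEJAVU)))
print(percent(2327, 149476), percent(5822, 149476))
```

With the full reports pasted in, `regressions` would list every block where switching fonts loses glyphs, which is the part of the comparison that matters for the Latin-script question.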

belegdol, May 26 '24 11:05

Re: availability, it does not seem too bad either:

  1. https://dejavu-fonts.github.io/Download.html (Third-party packages section)
  2. https://archlinux.org/packages/extra/any/ttf-dejavu/
  3. https://packages.manjaro.org/?query=dejavu

belegdol, May 26 '24 11:05

126b4036e3433ee5d1c48a98d2feb8492b6d4cbc addresses the issue of vgmplay using weird characters for the playback controls (screenshot omitted).

cuavas, Dec 22 '24 16:12