nerd-fonts
[Suggestion] Fix invalid code points for some glyphs
Summary
The current builds overwrite some code points that the Unicode Consortium does not allow for custom glyphs. I want to change this by fixing font_patcher.
Problem Detail
Unicode defines Private Use Areas, and the Consortium itself has decided not to assign characters in these areas. So we can add glyphs there as we like.
The areas cover these code points: U+E000..U+F8FF, U+F0000..U+FFFFD, U+100000..U+10FFFD.
But Nerd Fonts overwrites code points outside these areas. The font_patcher writes glyphs as shown below.
- source: the ones from source font files.
- current: the ones font_patcher writes into.
Font Name | source | current | plan 1 | plan 2 |
---|---|---|---|---|
Seti-UI + Custom | E4FA-E52E | E5FA-E62E | ← | ← |
Devicons | E600-E6C5 | E700-E7C5 | E630-E6F6 | E700-E7C5 (not changed) |
Powerline Symbols | E0A0-E0B3 | ← | ← | ← |
Powerline Extra Symbols | E0A3-E0D4 | ← | ← | ← |
Pomicons | E000-E00A | ← | ← | ← |
Font Awesome | F000-F2E0 | ← | ← | ← |
Font Awesome Extension | E000-E0A9 | E200-E2A9 | ← | ← |
Power Symbols | 23FB-2B58 | ← | ← | ← |
Material | F001-F847 | F500-FD46 | E700-EF47 | F500-F8FF,E800-EC47 |
Weather Icons | F000-F0EB | E300-E3EB | ← | ← |
Font Logos (Font Linux) | F100-F11C | F300-F31C | ← | ← |
Octicons | F000-F105 | F400-F505 | ← | ← |
Octicons | 2665-2665 | ← | ← | ← |
Octicons | 26A1-26A1 | ← | ← | ← |
Octicons | F27C-F27C | F4A9-F4A9 | ← | ← |
The problem is that font_patcher writes the Material glyphs into the range U+F500..U+FD46. This range overlaps blocks that must not be used for this purpose.
area | name |
---|---|
U+F900..U+FAFF | CJK Compatibility Ideographs |
U+FB00..U+FB4F | Alphabetic Presentation Forms |
U+FB50..U+FDFF | Arabic Presentation Forms-A |
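The overlap described above can be checked mechanically. The following is a minimal sketch, not part of font_patcher: the helper names `is_private_use` and `clobbered_blocks` are made up for illustration, and the ranges are copied from this issue.

```python
# Unicode Private Use Areas as listed in this issue (inclusive ranges).
PRIVATE_USE_AREAS = [
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
]

# Assigned blocks named in the table above (inclusive ranges).
ASSIGNED_BLOCKS = [
    (0xF900, 0xFAFF, "CJK Compatibility Ideographs"),
    (0xFB00, 0xFB4F, "Alphabetic Presentation Forms"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),
]


def is_private_use(cp):
    """Return True if the code point lies inside one of the PUAs."""
    return any(lo <= cp <= hi for lo, hi in PRIVATE_USE_AREAS)


def clobbered_blocks(start, end):
    """Return (name, first, last) for each assigned block hit by start..end."""
    hits = []
    for lo, hi, name in ASSIGNED_BLOCKS:
        first, last = max(start, lo), min(end, hi)
        if first <= last:
            hits.append((name, first, last))
    return hits


# Current destination of the Material set according to the table.
MATERIAL_START, MATERIAL_END = 0xF500, 0xFD46

outside = [cp for cp in range(MATERIAL_START, MATERIAL_END + 1)
           if not is_private_use(cp)]
print(f"{len(outside)} code points outside the PUA, "
      f"from U+{outside[0]:04X} to U+{outside[-1]:04X}")
for name, first, last in clobbered_blocks(MATERIAL_START, MATERIAL_END):
    print(f"{name}: U+{first:04X}..U+{last:04X}")
```

Everything from U+F900 upward in the Material range falls outside the PUA, which is the part that plan 1 and plan 2 relocate.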
Suggestion
So I suggest two plans to solve this.
plan 1
- Move Devicons just after Seti-UI + Custom.
- Move Material to U+E700..U+EF47.
- Pros - Material will be still in a cluster.
- Cons - Two glyph sets will be moved.
plan 2
- Move and separate Material into U+F500..U+F8FF, U+E800-U+EC47.
- Pros - Only one set will be moved.
- Pros - The former part of Material still has the same code points.
- Cons - Material will be separated into two clusters.
I prefer plan 2 because it has less impact on the current builds. What do you think?
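As an illustration of plan 2, the remapping could look roughly like this. It is only a sketch: the function name `plan2_codepoint` is hypothetical, and it assumes the glyphs beyond U+F8FF are moved in order to a second cluster starting at U+E800. With this arithmetic the last glyph lands on U+EC46, one short of the U+EC47 given in the table, so the exact upper bound is worth double-checking.

```python
MATERIAL_CURRENT_START = 0xF500  # current destination, from the table
MATERIAL_CURRENT_END = 0xFD46
BMP_PUA_END = 0xF8FF             # last code point of the BMP Private Use Area
SECOND_CLUSTER_START = 0xE800    # start of the second cluster in plan 2


def plan2_codepoint(current):
    """Map a current Material code point to its plan 2 location."""
    if not MATERIAL_CURRENT_START <= current <= MATERIAL_CURRENT_END:
        raise ValueError(f"U+{current:04X} is not a current Material code point")
    if current <= BMP_PUA_END:
        # First cluster: already inside the PUA, so it stays where it is.
        return current
    # Second cluster: keep the glyph order, counting from U+F900.
    return SECOND_CLUSTER_START + (current - (BMP_PUA_END + 1))


for cp in (0xF500, 0xF8FF, 0xF900, 0xFD46):
    print(f"U+{cp:04X} -> U+{plan2_codepoint(cp):04X}")
```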
Thanks a lot @delphinus . I appreciate the thought you put into this.
To be completely honest I wasn't careful enough to be sure the ranges remained within the Private Use Areas. We should definitely make sure to stay within the PUAs going forward :blush:
I think I like plan 2 or some variation of it as well. I would prefer plan 1 for something that wouldn't be a major release (e.g. 2.x.x).
We need to decide if we are following semver strictly or not. If strictly, then this would technically be a breaking change either way and would require us to version as 3.x, but at the same time it would "feel" wrong to bump to version 3 if this was the only major change.
Good plans and suggestions here and I am mostly in agreement with you. I am wondering about versioning and what exactly the ranges should/could be :smiley:
Thanks for agreement. The changes will be a breaking change indeed. But for releasing v3.0.0, the “CHANGES” might be so less than the ones in v2.0.0. 🤔
Yeah, your recommendation makes complete sense and obviously some changes need to happen.
But for releasing v3.0.0, the “CHANGES” might be so less than the ones in v2.0.0. 🤔
Sorry, I didn't quite follow, can you elaborate?
I see there are a lot of changes in v1.1.0 → v2.0.0. But if you use v3.0.0 this time, v2.0.0 → v3.0.0 will have 1 diff (this issue) only. I was just curious. ;)
I get you now. I am thinking 3.0 would have this change and many others: update to material design, and other icon additions and programming fonts. I would like to do a 2.1 release soon; the next one might be that big 3.0 major.
Can you provide an update regarding this issue? That the Material Design set overrides non-PUA codepoints is a significant issue.
@aaronbell Unfortunately the update I am going to give is not what you want to hear: There really are no updates on any forward momentum on moving the Material Design codepoints. However, I am back at trying to get a release out, after that this is likely one of the top priorities. This will be a breaking change as far as Nerd Fonts goes so it would be a 3.0 release. I agree it is a significant issue.
Hope that helps.
@ryanoasis Thanks Ryan. Too bad. Unfortunately, I will not be adding full Nerd Fonts support to Cascadia Code until v3.0 is released as I am unwilling to include Material Design icons in their current location. International interoperability takes precedence.
@aaronbell I completely understand that position. Absolutely international support takes precedence as it should. The whole code points seeping outside of PUA is a big mistake on my part.
While there has been no momentum of the fix in terms of code I did start to group tasks under a new 3.0 milestone and there definitely is a pressure to make it right.
Thanks for your valuable input and straightforwardness.
How difficult is it to move glyphs to another location? I guess the font patcher needs to be adjusted?
https://github.com/ryanoasis/nerd-fonts/issues/365#issuecomment-519779578 (@ryanoasis)
I think I like plan 2 or some variation of it as well. I would prefer plan 1 for something that wouldn't be a major release (e.g. 2.x.x).
I believe changing the range two times is not something anybody wants. (If I read that correctly.)
Maybe the way how this shall be rolled out needs to be formally fixed. The changes themselves are trivial.
I think there are additional possible plans:
Plan 1
Devicons move to E630 - E6F6
Material move to E700 - EF47
Codicons needs to vacate EA60 - EBEB
Pro: Material in one block
Con: Material displaces Codicons (will they be useful with VS?)
Con: No space for future expansion of Seti + Custom
Con: No smooth transition a la Plan Plus possible
Plan 2
Material split and move to F500 - F8FF and E800 - EC47
Codicons needs to vacate EA60 - EBEB
Pro: A lot Material codepoints unchanged
Con: Material displaces Codicons (will they be useful with VS?)
Con: Material is split (is that really an issue?)
Plan 3
Material split and move to F500 - F8FF and E900 - ED47
Codicons needs to vacate EA60 - EBEB
Pro: A lot Material codepoints unchanged, only one digit changes in moved codepoints
Con: Material displaces Codicons (will they be useful with VS?)
Con: Material is split (is that really an issue?)
Plan 4
Material move to FF500 - FFD46
Pro: Material in one block, only one digit changes in moved codepoints
Con: Use of codepoints above FFFF (is that really an issue?)
Plan 5
Material move to F0001 - ...
Pro: Material in one block, on new original codepoints
Pro: Lots of space for Material expansion
Con: Use of codepoints above FFFF (is that really an issue?)
Additional Plan Plus
No matter which plan is decided on, I believe we should act on it now, and not wait until a next release. Specifically, it would be beneficial if the NEW destination codepoints were additionally filled with the glyphs right away, so people have a chance to adjust their setups, rather than only in release 2.2.0 or, even worse, 3.0.0.
In a second step, after a (major, see semver) release, the obsolete codepoints can be dropped.
That would result in a smoother transition path. It also means that (at least part of) Material exists two times in the patched fonts.
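The two-step transition of Plan Plus can be sketched as a placement table that lists both destinations during the transition and only the new one afterwards. This is an illustration only: `build_placement`, the glyph names, and the use of Plan 4's FF500 as the new start are assumptions, not font_patcher code.

```python
def build_placement(glyph_names, new_start, legacy_start, keep_legacy=True):
    """Return {glyph name: [code points the glyph is written to]}.

    With keep_legacy=True each glyph is placed at its new code point and
    additionally at its legacy one (transition release). With
    keep_legacy=False only the new code point remains (after the major
    release that drops the obsolete code points).
    """
    placement = {}
    for offset, name in enumerate(glyph_names):
        codepoints = [new_start + offset]
        if keep_legacy:
            codepoints.append(legacy_start + offset)
        placement[name] = codepoints
    return placement


# Hypothetical glyph names; FF500 is Plan 4's start, F500 the current one.
glyphs = ["glyph-a", "glyph-b", "glyph-c"]
transition = build_placement(glyphs, new_start=0xFF500, legacy_start=0xF500)
final = build_placement(glyphs, new_start=0xFF500, legacy_start=0xF500,
                        keep_legacy=False)

for name in glyphs:
    both = ", ".join(f"U+{cp:04X}" for cp in transition[name])
    print(f"{name}: transition [{both}], final [U+{final[name][0]:04X}]")
```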
Edit: Mention overlooked Codicons and add Plan 5
@wismill https://github.com/ryanoasis/nerd-fonts/pull/609#issuecomment-978943902 also mentions Plan Plus.
@Finii
It also means that (at least part of) Material exists two times in the patched fonts.
If we're okay with material existing twice over, we could take the newest version of Material Design Icons, and put them where their current codepoints have put them (0F0001 - 0F19C3). This will have the benefit of being easy to update (they're adding icons very frequently), and being a continuous block, and of course, having all our largest icon-packs in their canonical locations (I haven't done a history check, but thinking that @Templarian is not likely to do the code point shifting thing again).
I just noticed, thanks to @delphinus , that the original table in the top is outdated. Thanks for the updated table in https://github.com/delphinus/homebrew-sfmono-square/issues/67
After we included Codicons (#705) (0xEA60 - 0xEBEB), Plan 1 & 2 & 3 became impossible (or at least... we would need to move Codicons first :unamused:)
I will update my Plan List above accordingly.
With this maybe @earboxer's idea https://github.com/ryanoasis/nerd-fonts/pull/772#issuecomment-1023173171 for 0xF0001 gets rather interesting. I will list this as Plan 5 above.
Edit: Before only Plan 2 and 3 were mentioned, but 1, 2, and 3 are affected!
Plan 4 Pro: Material in one block, only one digit changes in moved codepoints
Apple SF Symbols has occupied the first 3,300+ codepoints in Plane 16 (Supplementary Private Use Area B, U+100000-U+10FFFF). So if Plan 4 were chosen, we would need another block as soon as the Material Icons exhausted Plane 15 (U+FFFFF). Plan 5 looks more future-proof.
So... it has been two years since my last inquiry on this. Is there a finalized decision for the location of Material Design icons that don't override other unicode slots?
@aaronbell Unfortunately .. no. Ryan's comments above were the last time he has been seen here, regrettably. Recently I started to push on with releases: the 2.2.0 that Ryan initiated, and at the moment 2.3.0, which is intended to update a lot of the source symbols and maybe fonts. Ryan envisioned the codepoint change for 3.0.0, breaking as it is.
TL;DR:
I believe most arguments point to Material at its original location, which is F0001 - F1AF0 (currently).
- This is the original location
- It does not interfere with any existing codepoints
- This is (begin of) PUA-A
=> Plan 5 a.k.a. #773
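The arguments in the list above can be backed with a little arithmetic. A minimal sketch, using only the ranges quoted in this thread (the constant and function names are made up for illustration):

```python
PUA_A = (0xF0000, 0xFFFFD)            # Supplementary Private Use Area-A
MATERIAL_NATIVE = (0xF0001, 0xF1AF0)  # original Material location (currently)


def contains(outer, inner):
    """Return True if the inclusive range `inner` lies within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Number of code points the Material range currently spans.
span = MATERIAL_NATIVE[1] - MATERIAL_NATIVE[0] + 1
# Code points left in PUA-A after the end of the Material range.
headroom = PUA_A[1] - MATERIAL_NATIVE[1]

print("inside PUA-A:", contains(PUA_A, MATERIAL_NATIVE))
print("code points spanned:", span)
print("code points left for growth:", headroom)
```

The range sits entirely inside PUA-A and leaves most of the plane free, which is what makes this location attractive for a set that keeps growing.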
Secretly I wonder ... is it really worth to add another 7,000 glyphs? Where will it end? ;-)
If you think this is the way to go, I will do it here with that points.
Well, excuse me,
Unrelated, maybe I can ask you @aaronbell for some information?
People over at Fontforge (once) believed that Windows has an additional length limit on the font FamilyName (writeup by me here). But I can not find that anywhere on Microsoft's Typography websites, and additionally I use fonts with longer names on Windows (10) with no problems. Is that really an issue, or was that like a Windows 3.1 problem? Or Windows 7, or MS-Word 5 with a too-narrow font-name pulldown?
Below the line (here) is just data I collected to come to the conclusion:
But, as @earboxer suggested, for a smoother codepoint transition we (Nerd Font) could introduce the 'new codepoints' additionally to the current ones, so that 2.3.0 contains old and new codepoints. For that reason it is exactly the right moment now to decide on that. I would expect 2.3.0 in October.
Please let me (again) tabularize data:
Glyph set original location now and after updating (sorted by current dest)
Glyph set | current start | current length | update start | update length | update codepoint stable? | comment | current destination | source |
---|---|---|---|---|---|---|---|---|
Pomicons | E000 | 10 | - | | | no update, [4] | E000 | https://github.com/gabrielelana/pomicons |
Font Awesome Ext | E000 | 170 | - | | | no update, [3] | E200 | https://github.com/AndreLZGava/font-awesome-extension |
Weather | F000 | 236 | F000 | 222 | probably | | E300 | https://github.com/erikflowers/weather-icons |
Seti + our | E4FA | 59 | E4FA | ~175 | yes | [0] | E5FA | https://github.com/jesseweed/seti-ui |
Devicons | E600 | 198 | E600 | ~500 | no | [1] | E700 | https://github.com/devicons/devicon |
Codicons | EA60 | 396 | EA60 | 430 | yes | | EA60 | https://github.com/microsoft/vscode-codicons |
Font Awesome | F000 | 737 | ? | 213 + 515 ? | no | [2] | F000 | https://github.com/FortAwesome/Font-Awesome |
Font Logos | F300 | 48 | - | - | yes | already updated | F300 | https://github.com/Lukas-W/font-logos |
Octicons | F000 | 262 | ? | 515 | no | | F400 | https://github.com/primer/octicons |
Material | F001 | 2119 | F0001 | 6896 | unknown | | F500 | https://github.com/Templarian/MaterialDesign-Font |
[0] Codepoints allocated by us
[1] Update scattered and lots of icons unusable for fonts
[2] Current release scattered and split into multiple font files
[3] Maybe obsolete, check extension glyphs for duplicates
[4] Maybe obsolete? Codepoints clash sometimes with original font's ligatures etc
Edit: Add link to PR
@Finii Ah well, that explains that.
Secretly I wonder ... is it really worth to add another 7,000 glyphs? Where will it end? ;-)
Might want to ask Unicode about their decision to include Emoji :). I fear there will always be new icons to add or symbols that people want. All you can do is go along with it, or say, "NO MORE!"
People over at Fontforge (once) believed that Windows has an additional length limit on the font FamilyName.
I don't think there's any documentation on it on the Microsoft typography website (it isn't really spec related). There's some info / investigation here worth reading: https://github.com/googlefonts/fontbakery/issues/2179 tldr: It is primarily a legacy issue in certain applications and situations, but appears to crop up in unexpected places, so the recommendation is to keep less than 29 characters long to avoid any problems.
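The recommendation above boils down to a simple length check. A minimal sketch, assuming the limit applies to the plain character count of the family name; the function name and the sample names are hypothetical:

```python
LEGACY_NAME_LIMIT = 29  # "keep less than 29 characters long", per the comment above


def is_legacy_safe(family_name, limit=LEGACY_NAME_LIMIT):
    """Return True if the name is shorter than the recommended limit."""
    return len(family_name) < limit


# Hypothetical family names, for illustration only.
for name in ("Example Mono", "Example Mono Extra Condensed Nerd Font"):
    status = "ok" if is_legacy_safe(name) else "too long for some legacy apps"
    print(f"{name!r} ({len(name)} chars): {status}")
```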
2.3.0
Wow! It looks like there are a lot of new glyphs being added across the full Nerd Fonts set. I expect that if these glyph sets continue to expand, it'll make things challenging to keep them separate. Not to mention needing to organize and create a 'master' version of the codepoints. Phew!
If I understand the chart that you've included, the current plan (plan 5) is to essentially move Material design to F0001 where it can expand freely as necessary. It sounds like you're also planning to preserve the existing location for the time being until the breaking 3.0.0 change.
Anyway, that plan works for me. I've been circling back to investigating native support for Nerd Fonts again (finally), and having a Unicode-compliant solution is great.
I look forward to the finalized locations for everything! Let me know if I can be of assistance.
@aaronbell
Thank you for the information on name length. This is very much appreciated.
2.3.0
Plan 5 is essentially adding the current Material Design Icons at their native codepoints (i.e. F0001 - F1AF0).
The current / old Material Design Icons (renamed to 'legacy') are kept in the problematic regions and will be removed with 3.0.0.
This shall make the transition of codepoints easier for the users.
I worked on the relevant PR #773 today - there is a small scale-translate problem I'd like to solve before merging.
(I.e. you understood that perfectly right.)
a lot of new glyphs
Material is exploding, but now has room to 'do its thing'. I'm not really sure it does make sense to add it at all. Today's terminal emulators often do a good job with glyph rescaling; and putting the Material Design Desktop font somewhere and rely on font fallback should be a good solution for most people.
From the other sets it is only Octicons and Devicons that really grow. Both without stable codepoints over updates :unamused: But I seem to have started a codepoint discussion with Devicons; their web-user centric view has to be expanded ;)
@aaronbell
tldr: It is primarily a legacy issue in certain applications and situations, but appears to crop up in unexpected places, so the recommendation is to keep less than 29 characters long to avoid any problems.
I'm so relieved that Cascadia Code also violates that, and not only us ;-}
For example CascadiaCodePL-ExtraLightItalic.ttf (2111.01) has 34 chars in ID.4.
And while we have code to limit the length of ID.1 and ID.2 (albeit half broken), we do not use the same abbreviations in ID.4, and almost all fonts (also the 'Windows Compatible' ones) have very long full-names. No one ever complained, so maybe we can ignore the MS-Word-2011s and IE9s out there. :grimacing:
(But sorry this should not be discussed in this issue.)
Related #813
@glepnir this is the nerd-font bug that causes https://github.com/kovidgoyal/kitty/issues/5415
@ppwwyyxx Yes. We try to correct that long standing issue (place symbols where Chinese and other glyphs should be) soon (everything has been prepared for the fix already). Sorry that it arose at all.
But maybe a question about kitty. From what @kovidgoyal writes it seems kitty has a workaround that replaces the (erroneous) symbols with the correct Chinese glyphs?
If so, where do the glyphs come from? How does kitty decide if that is a legitimate Chinese font with Nerd Font symbols patched in (a future version that leaves the Chinese glyphs intact) and uses the font-encoded glyphs, or use some other fallback font/glyph? Or is this a setting?
Often kitty users also raise Issues here, so I would like to understand that part of kitty a bit better.
Thank you :-)
In kitty glyphs come from fonts, which font is chosen depends on the system font libraries (fontconfig/CoreText). First the main font specified for kitty is tried, if that does not have the glyph, then the system is queried for a fallback. This is the same as in most applications.
However, kitty has a feature called symbol_map which allows users to instruct it to load glyphs for the specified code points from a particular font. Many kitty users use this feature to work with Nerd Fonts symbols.
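For illustration, a symbol_map line for kitty.conf can be generated from a list of ranges. This sketch assumes the option takes comma-separated U+XXXX-U+YYYY ranges followed by a font name; the helper `symbol_map_line` is hypothetical, and the font name "Symbols Nerd Font" and the chosen ranges are assumptions to be replaced with whatever is installed and needed.

```python
def symbol_map_line(ranges, font_family):
    """Build one kitty.conf symbol_map line from inclusive code point ranges."""
    spec = ",".join(
        f"U+{start:X}" if start == end else f"U+{start:X}-U+{end:X}"
        for start, end in ranges
    )
    return f"symbol_map {spec} {font_family}"


# Seti-UI + Custom (current destination) and Material at its native location,
# both taken from this thread; the font name is an assumption.
line = symbol_map_line([(0xE5FA, 0xE62E), (0xF0001, 0xF1AF0)],
                       "Symbols Nerd Font")
print(line)
```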
@kovidgoyal Wow, thank you for the instantaneous answer!
Again let me apologize that we introduced this issue for a lot of users at all.
I hoped to get the corrected fonts out by the end of this year (i.e. v3.0.0) but at the moment I'm a bit backlogged.
On Thu, Nov 17, 2022 at 12:21:28AM -0800, Fini wrote:
@kovidgoyal Wow, thank you for the instantaneous answer!
Again let me apologize that we introduced this issue for a lot of users at all. I hoped to get the corrected fonts out by the end of this year (i.e. v3.0.0) but at the moment I'm a bit backlogged.
No worries, we are all busy :)
@Finii #773 only adds Material Icons at the new locations, but it seems the original glyphs are still on the invalid code points (outside the PUA).
So #773 has not solved this issue. This issue should be closed only once all glyphs at the original locations have been removed, don't you think?
Yes, according to plan this will come (glyphs removed) with v3.0.0, this was just one necessary intermediate step.
Now that this is moving outside the Basic Multilingual Plane, it would be useful to also have the UTF-16 notation for these icons in the cheat sheet next to the hex value, as the hex value can't be used directly in JSON or other text files. For example, the nf-md-folder hex value is f024b, which can't be written as \uf024b, unlike before. From a user perspective that's not very accessible, so having the UTF-16 notation (\udb80\ude4b) in the sheet would be very useful.
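Until the cheat sheet offers it, the UTF-16 notation can be derived from the hex value with the standard surrogate-pair arithmetic. A minimal sketch (the function name `utf16_escape` is made up for illustration):

```python
def utf16_escape(codepoint):
    """Return the \\uXXXX escape(s) for a code point as used in JSON."""
    if not 0 <= codepoint <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {codepoint:#x}")
    if codepoint < 0x10000:
        # Inside the Basic Multilingual Plane a single escape is enough.
        return f"\\u{codepoint:04x}"
    # Above the BMP: split the 20-bit offset into two 10-bit halves.
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)    # high (leading) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trailing) surrogate
    return f"\\u{high:04x}\\u{low:04x}"


print(utf16_escape(0xF024B))  # nf-md-folder -> \udb80\ude4b
```

A JSON parser turns the two escapes back into the single code point U+F024B.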
Release is in repo, release as packages pending.