Cleanup of display fonts

Open rmkaplan opened this issue 6 months ago • 3 comments

In the current standard loadup the displayfont files are spread out in a number of directories, as indicated by the value of DISPLAYFONTDIRECTORIES:

fonts>displayfonts> fonts>altofonts> fonts>adobe> fonts>big> fonts>other>

In the past it appears that some files have been copied from one directory to another (often into displayfonts>), so that there are multiple instances of the same file. An initial goal is to have only one directory, fonts>displayfonts>, hold all of the files that we need at runtime, and to move all of the copies to corresponding subdirectories of a separate archival directory (e.g. archive/fonts/), which we would not distribute.

To take a simple case, altofonts/timesroman6.strike is a duplicate of displayfonts/c0/timesroman06-mrr-c0.displayfont, so the first one would be moved to archive/fonts/altofonts/timesroman6.strike

For the more obscure case, where other/c0/sigma20-mrr-c0.displayfont and altofonts/sigma20.strike are both duplicates of displayfonts/c0/sigma20-mrr-c0.displayfont, they would be moved to archive/fonts/other/c0/sigma20-mrr-c0.displayfont and archive/fonts/altofonts/sigma20.strike
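As a rough illustration of the two cases above, here is a minimal Python sketch that finds byte-identical files by content hash and plans the moves into the archive. The function names and the choice of SHA-256 are assumptions made for the example, not part of the actual Medley tooling; it only plans moves for files that duplicate something under displayfonts/.

```python
import hashlib
from collections import defaultdict
from pathlib import Path, PurePosixPath

KEEP_DIR = "displayfonts"                   # the one directory kept at runtime
ARCHIVE_ROOT = PurePosixPath("archive/fonts")  # archive location named in the issue


def digest(path: Path) -> str:
    """Content hash used to detect byte-identical files."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_archive_moves(fonts_root: Path) -> dict[str, str]:
    """Map each duplicate outside displayfonts/ to its archive destination.

    Only files whose content also appears somewhere under displayfonts/
    are planned for archiving; everything else is left alone.
    """
    by_digest = defaultdict(list)
    for path in sorted(fonts_root.rglob("*")):
        if path.is_file():
            rel = PurePosixPath(path.relative_to(fonts_root).as_posix())
            by_digest[digest(path)].append(rel)

    moves = {}
    for group in by_digest.values():
        if any(rel.parts[0] == KEEP_DIR for rel in group):
            for rel in group:
                if rel.parts[0] != KEEP_DIR:
                    # Preserve the original subdirectory under the archive root.
                    moves[str(rel)] = str(ARCHIVE_ROOT / rel)
    return moves
```

Calling `plan_archive_moves(Path("fonts"))` returns a dictionary of source-to-destination paths relative to the fonts directory; nothing is moved until a caller acts on the plan.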

In these examples the files are not only duplicates but their names indicate that they represent the same family/size/face. There is another situation where duplicate files appear to represent different fonts: displayfonts/c0/helvetica04-mrr-c0.displayfont, displayfonts/c0/helvetica01-mrr-c0.displayfont, and displayfonts/c0/helvetica02-mrr-c0.displayfont are all the same file. Presumably one of these names represents the actual/nominal size of the font, and the others reflect an attempt to fill in for missing fonts. We want to pick one of these to keep, and archive the others. Although our current character-set datatypes don't encode the nominal size (which may or may not be explicit in the file), a reasonable heuristic is to keep the one whose nominal size is closest to the internally represented font height (sum of its ascent and descent). In this case size 4 would be kept, and 1 and 2 would be archived. But we would also record the intended substitutions, so that we can fake up 1 and 2 from 4 if there is a FONTCREATE request. We would make entries ((HELVETICA 1) (HELVETICA 4)) and ((HELVETICA 2) (HELVETICA 4)) on the list DISPLAYFONTSUBSTITUTIONS.
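The keep-one heuristic can be sketched in Python as follows. This is only an illustration: the font height (ascent plus descent) is passed in by the caller because reading it from a displayfont file is outside the scope of the sketch, the helper names are hypothetical, and breaking ties toward the smaller nominal size is an assumption.

```python
import re

# Leading "<family><size>-" of names such as helvetica04-mrr-c0.displayfont.
NAME_RE = re.compile(r"^([a-z]+)(\d+)-", re.IGNORECASE)


def parse_family_size(filename: str) -> tuple[str, int]:
    """Extract (FAMILY, nominal size) from a displayfont file name."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized font file name: {filename}")
    return match.group(1).upper(), int(match.group(2))


def choose_keeper(filenames: list[str], font_height: int):
    """Pick the file whose nominal size is closest to the font height.

    Returns (kept filename, substitutions), where each substitution is a
    pair ((FAMILY, requested size), (FAMILY, kept size)).
    """
    parsed = {name: parse_family_size(name) for name in filenames}
    keeper = min(
        filenames,
        key=lambda name: (abs(parsed[name][1] - font_height), parsed[name][1]),
    )
    substitutions = [
        (parsed[name], parsed[keeper]) for name in filenames if name != keeper
    ]
    return keeper, substitutions
```

With the three helvetica files and a hypothetical height of 5, the size-4 name is kept and the substitution pairs for sizes 1 and 2 both point at (HELVETICA 4), matching the entries proposed for DISPLAYFONTSUBSTITUTIONS.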

When one of the duplicates appears in displayfonts/, we can assume that it was intended to be included as part of the font inventory. But there are still some duplicates that don't correspond to anything in displayfonts/, for example adobe/c0/palatino12-bir-c0.displayfont and adobe/c0/palatino09-bir-c0.displayfont. Should we copy this file (with an appropriate name) to displayfonts/ and then delete these two?

The same question arises for files that are not duplicated but still only exist in one of the deprecated directories: should we copy to displayfonts/ and delete the source?

rmkaplan avatar Jun 25 '25 17:06 rmkaplan

> In these examples the files are not only duplicates but their names indicate that they represent the same family/size/face. There is another situation where the different files appear to represent different fonts: displayfonts/c0/helvetica04-mrr-c0.displayfont displayfonts/c0/helvetica01-mrr-c0.displayfont displayfonts/c0/helvetica02-mrr-c0.displayfont

The situation for some of these "smaller" fonts isn't quite the same -- if you want legibility, you can't use a 1-pixel high font. So this isn't quite a "coercion" that might be "repaired".

https://LarryMasinter.net https://interlisp.org

masinter avatar Jun 25 '25 18:06 masinter

There is a similar size coercion for some of the larger fonts, someone's idea of the best approximation for sizes that don't independently exist.

Without looking back at the directories, I think Classic 24 and Classic 36 are the same file--probably 24 is the true size of the font.

rmkaplan avatar Jun 25 '25 20:06 rmkaplan

This is a list of files in the various displayfont directories that are not duplicates of a file already in displayfonts/

https://www.dropbox.com/scl/fi/2vuzqb3bpw1tzzlnajed0/orphans.txt?rlkey=yma83ym9fnsqp5hemqlwjejbm&dl=0

Not sure whether some/all ought to be copied over.

rmkaplan avatar Jun 26 '25 00:06 rmkaplan

Of the 270 or so files that are not duplicates of display fonts, about 115 have names that suggest an XCCS character set, e.g. adobe/c0/palatino10-bir-c0.displayfont and big/c356/classic72-brr-c356.displayfont. It probably makes sense to copy these over to displayfonts/cxxx/ and merge them together with any other files with the same family/size/face that may already happen to be there.

That leaves about 155 files that may be more difficult to assimilate. They have glyphs of various sorts that may be useful, but it may take some work to figure out whether they conform to any of our known character encodings.

rmkaplan avatar Jun 27 '25 05:06 rmkaplan

I have moved to archivedisplayfonts/ all the files that are duplicates of files in displayfonts/cxxx/.

I then moved over to displayfonts/ (mostly to c0) a few other files that obviously should be there, and standardized their names (cyrillic10.strike → CYRILLIC10-MRR-C0.DISPLAYFONT). I also moved and standardized the fonts in altofonts/thin, giving them COMPRESSED as the last component of the face. (The originals remain in archivedisplayfonts/)
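For reference, the renaming pattern in the cyrillic10 example could be expressed roughly as below. This is a Python sketch under stated assumptions: the default face MRR and charset C0 are taken from that one example, the two-digit zero padding is inferred from names like timesroman06-mrr-c0.displayfont, and the function itself is hypothetical rather than the code actually used.

```python
import re

# Legacy names of the form <family><size>.strike, e.g. cyrillic10.strike
LEGACY_STRIKE = re.compile(r"^(?P<family>[a-z]+)(?P<size>\d+)\.strike$", re.IGNORECASE)


def standardize_strike_name(filename: str, face: str = "MRR", charset: str = "C0") -> str:
    """Convert a legacy .strike name to FAMILYSIZE-FACE-CHARSET.DISPLAYFONT."""
    match = LEGACY_STRIKE.match(filename)
    if match is None:
        raise ValueError(f"not a <family><size>.strike name: {filename}")
    family = match.group("family").upper()
    size = int(match.group("size"))
    # Assumption: sizes are zero-padded to two digits, as in timesroman06.
    return f"{family}{size:02d}-{face}-{charset}.DISPLAYFONT"
```

A different face can be passed for cases like the altofonts/thin files, where the face component differs from the default.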

The adobe/palatino fonts are also now moved over/archived. (There is a file DISPLAYFONTCOERCIONS dated 1991 in fonts/adobe that has various family mappings, e.g. NEWCENTURYSCHLBK → CENTURYSCHOOLBOOK. Is that still relevant?)

Some directories and subdirectories have ended up empty, and those are deleted and removed from DISPLAYFONTDIRECTORIES.

Basically, that leaves displayfonts/ and a bunch of residual files in altofonts/. The altofonts are confusing. At the top level of altofonts/ there are a lot of timesroman, helvetica, and gacha files with names that correspond to files that are in displayfonts/ but are not actually copies. Does anyone know anything about those altofonts/ files?

Also, how do those relate to the similarly named files elsewhere in altofonts/, in the eightbit/, original/, roundedwidths/ subdirectories? None of them are copies. Should they all be archived, or are some of them preferable to the ones we already have in displayfonts/?

rmkaplan avatar Jun 29 '25 19:06 rmkaplan

Have you pushed a branch with these changes so we can look at how it plays out? (no PR necessary to do that.)

nbriggs avatar Jun 29 '25 19:06 nbriggs

GitHub isn't very good at handling file moves and subsequent deletions. I think we'd be better off using the "Interlisp/fonts" repository to collect together what we think are the "right" files, and removing the "fonts" subdirectory of the medley repository completely.

I'd start by populating the new fonts repository with the most recent Unix-compatible directories of fonts from "envos" (Venue) releases, and then adding in whatever other fonts are needed / used.

When there are files with the same names but different content, it requires some judgement to decide which one is "better".

On the "complete" files: if I have a 72-point font (say times72-mrr) I don't want to pull in the default Kanji 72 fonts. So I think we have to allow some run-time coercion. If we can speed up the TrueType-to-PBM conversion we could build fonts for any size on the fly. I'd like to look at the code for the conversion and see if we can optimize it.

masinter avatar Jul 01 '25 18:07 masinter

There are a few subdirectories at the top level of the Envos repo. The only one that seems to include display fonts is xd0e/RELEASE/medley/2.0/fonts/display/

Is there anything else to look at?

This has subdirectories

  chinese/
  JIS1/
  JIS2/
  miscellaneous/ (includes classicthin hippo logo math oldenglish)
  presentation/ (includes gacha helvetica timesromand)
  printwheel/ (includes boldps lettergothic titan)
  publishing/ (includes classic modern terminal)

and the files underneath are labeled with their charset but they are not in charset subdirectories.

  1. Is it safe to assume that files in these directories that byte-differ from files with the same name in our current displayfonts/ are the ones to use? I.e., should they automatically replace our current files and be included in our distribution, without hand-inspection?

  2. Is there any reason to maintain this subdirectory structure?

  3. Eventually, all the character sets of a given family/size/face will be wrapped into a single medleyfont file, so the absence of a charset substructure is moot.

  4. We have many odd families (music, cyrillic, apl, dancer...) that are not instantiated here. We will just keep those as is.

rmkaplan avatar Jul 02 '25 19:07 rmkaplan

These are the font files (.displayfont, .ac, .strike) that I see in the envos repository file lists --

fonts.txt

nbriggs avatar Jul 02 '25 19:07 nbriggs

I compared the files in envos/medley0h/fonts/display with the files in envos/xd0e/release/medley2.0/fonts/display/. They are essentially byte-equivalent.

I say "essentially" because they have slight differences in the implicit font substitutions (helvetica1 using the same files as helvetica4), which we will do algorithmically as needed.

And 3 or 4 files have byte differences that I will look at, to pick which of these directories we should pay attention to.

Then there is the question of comparing these to our current displayfonts: do we give these priority if there are differences?
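A comparison like this can be reproduced with a short Python script along the following lines. It is a sketch, not the procedure actually used: it matches files by lowercased name because the Envos directories are not split into charset subdirectories, and it assumes each name occurs only once per tree. The function names are made up for the example.

```python
import filecmp
from pathlib import Path


def index_by_name(root: Path) -> dict[str, Path]:
    """Index every file under root by its lowercased file name."""
    return {p.name.lower(): p for p in sorted(root.rglob("*")) if p.is_file()}


def compare_font_trees(reference: Path, current: Path) -> dict[str, list[str]]:
    """Classify file names as identical, different, or present in one tree only."""
    ref, cur = index_by_name(reference), index_by_name(current)
    report = {"identical": [], "different": [], "only_reference": [], "only_current": []}
    for name in sorted(ref.keys() | cur.keys()):
        if name not in cur:
            report["only_reference"].append(name)
        elif name not in ref:
            report["only_current"].append(name)
        elif filecmp.cmp(ref[name], cur[name], shallow=False):
            # shallow=False forces a byte-by-byte content comparison.
            report["identical"].append(name)
        else:
            report["different"].append(name)
    return report
```

The "different" list is the set that needs a decision about priority, and "only_reference" is the set of candidate imports.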

rmkaplan avatar Jul 02 '25 21:07 rmkaplan

Turns out that a few of the files in xd0e were smashed, and that medley0h had 4 more that were actually byte-identical to what we now have.

So I think medley0h is our point of comparison and the source of any new imports.

rmkaplan avatar Jul 02 '25 21:07 rmkaplan

medley0h had 35 font files that were not already included in our displayfonts/. They are mostly garbage: each has a few random characters for a family/size/face/charset combination for which we otherwise had no information at all. For example, a few dashes here and there, Greek charsets with only pi, and some whose symbols I can't identify.

I think it is harmless (but probably useless) to move these over to displayfonts/; then we will have a superset of everything that was in Envos.

rmkaplan avatar Jul 03 '25 00:07 rmkaplan

The proposed collection and arrangement of files in this first phase of cleanup is in PR #2203

rmkaplan avatar Jul 03 '25 18:07 rmkaplan

This stage was completed in #2203

rmkaplan avatar Jul 22 '25 01:07 rmkaplan