Display font coercion?
I'm trying to understand the strategies for instantiating a display charset when there is no actual fontfile with the desired information.
There is a recursive process, driven by font-mapping variables, that searches for alternative families and sizes, and then there is code to fake up italic and bold when a (possibly coerced) font with the right face can't be found. (The face fake-up is approximate at best, which is different from the algorithmic transformation that does 90-degree rotations.)
It is hard to track the sequence of events, probably a design that wasn't quite worked out or implemented. So I am chipping away.
To start, there are two variables that drive the search for a real character set when the indicated fontfile doesn't exist: MISSINGDISPLAYFONTCOERCIONS and MISSINGCHARSETDISPLAYFONTCOERCIONS. MISSINGDISPLAYFONTCOERCIONS is initialized with the following value: (((GACHA) (TERMINAL)) ((MODERN) (CLASSIC)) ((TIMESROMAN) (CLASSIC)) ((HELVETICA) (MODERN)) ((TERMINAL) (MODERN))) So, since the Greek character set doesn't really exist in Gacha, the recursive calls will look for Greek in Terminal. And since it doesn't exist in Terminal, it will next look in Modern, and if it doesn't exist there it tries Classic.
The initial value for MISSINGCHARSETDISPLAYFONTCOERCIONS is (((GACHA) (TERMINAL)) ((MODERN) (CLASSIC)) ((TIMESROMAN) (CLASSIC)) ((HELVETICA) (MODERN)) ((TERMINAL 6) (MODERN 6)) ((TERMINAL 8) (MODERN 8)) ((TERMINAL 10) (MODERN 10)) ((TERMINAL 12) (MODERN 12))) Here the coercions for Terminal are restricted by font size, but since this covers all of the available sizes, I think it is essentially equivalent to the one above for those sizes. The difference in behavior would only show up if you asked for a different font size, say 14. The one above might end up at Classic 14, while this one would not produce anything.
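To make the chain-following concrete, here is a hypothetical Python sketch of how I read the resolution logic; the real code is Interlisp (\CREATE-REAL-CHARSET.DISPLAY) and operates on the list structures above, so names and representation here are illustrative only:

```python
# Each coercion entry maps a (FAMILY) or (FAMILY SIZE) key to a target;
# a key without a size matches any size, and a target without a size
# keeps the requested size.
MISSINGDISPLAYFONTCOERCIONS = [
    (("GACHA",), ("TERMINAL",)),
    (("MODERN",), ("CLASSIC",)),
    (("TIMESROMAN",), ("CLASSIC",)),
    (("HELVETICA",), ("MODERN",)),
    (("TERMINAL",), ("MODERN",)),
]

def coerce(family, size, coercions):
    """Return the next (family, size) to try, or None if no entry matches."""
    for key, target in coercions:
        if key[0] == family and (len(key) == 1 or key[1] == size):
            return (target[0], target[1] if len(target) > 1 else size)
    return None

def find_real_charset(family, size, coercions, have):
    """Follow the coercion chain until a real charset file exists
    (membership in `have`); None means a font-not-found failure."""
    seen = {(family, size)}
    while (family, size) not in have:
        nxt = coerce(family, size, coercions)
        if nxt is None or nxt in seen:   # dead end or a cycle
            return None
        seen.add(nxt)
        family, size = nxt
    return (family, size)
```

With only Classic 10 on disk, a request for Gacha 10 walks Gacha, Terminal, Modern, Classic and stops at Classic 10.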
But the current code (\CREATE-REAL-CHARSET.DISPLAY) runs through both coercions at each level of recursion, so I think this is essentially redundant--and confusing.
I haven't found any reference to these coercion variables in the documentation files. Does anybody know why it is broken out in this way? If not I will start by removing one of these.
(But I'll also note one subtle difference in the code: the CHARSET (lower) one does not run on character set 0. But that's just another, tighter restriction, and it still falls through to the one that is not size-restricted. For example, it wouldn't end up as a font-not-found failure on size 14, if that's what they were looking for, since it would still hit the more general (upper) mapping. And I'm sure that nobody would understand how to drive this to get more subtle effects.)
So, OK to do a first round of simplification?
Both of those have been around for a long time - and also show up in the FX80 printer driver written (modified?) by HDJ ...
The FX80 code only references one of the MISSING* variables, but otherwise seems to implement the same strategy as the display.
But it also references a third display coercion variable, DISPLAYFONTCOERCIONS (not "MISSING"). This is interpreted in the same way (mapping family/size to another family/size) but at a different point in the logic (\CREATECHARSET.DISPLAY). Its initial value is NIL, so it is a little harder to understand the actual intent.
Whereas the missing coercions are applied after a failure to find real font information, the DISPLAYFONTCOERCIONS are applied at the top, before the code even looks to see whether there is any real data, even data for character set 0. So suppose that this were to be initialized as one of the ones above, coercing GACHA to TERMINAL to MODERN to CLASSIC, and suppose that character set 0 exists at every point along the chain. As I read the code, it would start by ignoring that Gacha-0 exists, recurse to and ignore Terminal-0, then Modern, then Classic, where it would finally stop the recursion and read the Classic-0 file.
Essentially, a complicated way of mapping Gacha to Classic, ignoring everything along the way. Wouldn't it be simpler just to say (create FONTDESCRIPTOR using (FONTCREATE '(CLASSIC 10)) FONTFAMILY ← 'GACHA)? It seems like the recursion isn't doing much work here, and that a one-step mapping (GACHA CLASSIC) would be more obvious, if that's the intent. A more reasonable use-case might be to specify a synonym-name for another font no matter how its charsets might be recursively coerced, e.g. a pairwise specification like (SANSFIXED GACHA), interpreted basically as the create-using.
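My reading of the pre-check behavior can be sketched as follows (hypothetical Python, illustrative names; the point is that \CREATECHARSET.DISPLAY applies the coercion before testing for real data, so the walk runs unconditionally to the end of the chain):

```python
# Suppose DISPLAYFONTCOERCIONS were initialized with the chain discussed
# above; the example value below is an assumption, since the shipped
# initial value is NIL.
DISPLAYFONTCOERCIONS_EXAMPLE = [
    (("GACHA",), ("TERMINAL",)),
    (("TERMINAL",), ("MODERN",)),
    (("MODERN",), ("CLASSIC",)),
]

def coerce_once(family, size, coercions):
    """One step of the family/size mapping, as in the MISSING* variables."""
    for key, target in coercions:
        if key[0] == family and (len(key) == 1 or key[1] == size):
            return (target[0], target[1] if len(target) > 1 else size)
    return None

def chain_terminus(family, size, coercions):
    """Coercion applied *before* any existence check: every intermediate
    family is skipped even if its charset files exist, so the result is
    always the end of the chain."""
    seen = {(family, size)}
    while True:
        nxt = coerce_once(family, size, coercions)
        if nxt is None or nxt in seen:
            return (family, size)
        seen.add(nxt)
        family, size = nxt
```

So a request for Gacha 10 lands on Classic 10 regardless of whether Gacha-0, Terminal-0, or Modern-0 exist, which is why a one-step synonym mapping would express the same thing more plainly.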
If this is correct, I would remove the recursion for DISPLAYFONTCOERCIONS, just build the target font (with charsets possibly coerced in a priority order) and copy it over. I would also collapse the two MISSING variables into a single DISPLAYCHARSETCOERCIONS variable (and fix the FX80). The code would recurse through the chain to find the charsets. And then add some documentation, which so far I haven't been able to find.
(After cleaning up this logic, my next move would be to coerce all the charsets for all the fonts offline, and write them all out in my new medleyfont format. I.e. eliminate font coercion as a run-time problem. The Gacha10 file would be as complete as it can be, with some stuff from the source Gacha, some from Terminal,..., or even from a BDF font)
I would have guessed it was done at runtime because we had no guarantee what fonts people would have installed, and because they might want to coerce missing character sets of their fonts (possibly unknown to us) to Xerox fonts which did have those character sets.
I noticed in the CHM PARC archives that Meg Withgott had an IPA font that we can't load anymore because it was written out as an uglyvar and the datatype for FONTDESCRIPTOR has changed. It would be good not to get into that situation again.
The (cleaned up) coercion machinery would still be there, but it would not apply to fonts that we are delivering, since those would already have been coerced and complete. So probably those variables would be NIL in the loadups that we produce.
I think we would also want a more fine-grained form of coercion. The current logic only operates at the charset level--if it finds a charset somewhere in the chain, it sticks it in. But charsets themselves can be incomplete, with information/bitmaps for only some real characters and black-box slugs for the others. Those others might be filled in with real data if the chain-search continued.
So, if we can define a predicate (SLUGP character charset), we can test whether there are any slugs in a retrieved charsetinfo. If there's at least one slug, keep recursing until we find charsets with those real characters. My new function MOVEFONTCHARS would then move the information for each real character encountered along the way into the first-located charsetinfo. At the end, we write it all out. This would generally be offline, so speed doesn't matter.
I guess a displayfont slug is a character whose bitmap is all black? Maybe different predicates for other font devices.
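A minimal sketch of the proposal, in Python for illustration (SLUGP and MOVEFONTCHARS are the proposed Interlisp names; here a charset is assumed to be a mapping from character code to a bitmap given as rows of 0/1 pixels, and the all-black test is the display-font slug predicate guessed at above):

```python
def slugp(bitmap):
    # A display-font slug: every pixel in the glyph's bitmap is black (1).
    return all(px == 1 for row in bitmap for px in row)

def movefontchars(dst, src):
    # Move real (non-slug) characters from src into dst wherever dst
    # still has a slug, or no entry at all.
    for code, bitmap in src.items():
        if not slugp(bitmap) and (code not in dst or slugp(dst[code])):
            dst[code] = bitmap
    return dst

def fill_charset(first, chain):
    # Keep merging charsets found along the coercion chain for as long
    # as the first-located charsetinfo still contains any slugs.
    for charset in chain:
        if not any(slugp(b) for b in first.values()):
            break
        movefontchars(first, charset)
    return first
```

Since this would generally run offline before writing the merged result out, the per-pixel scans don't need to be fast.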
This new format depends on the datatype only insofar as it writes out the information for each of its fields, basically as a property list. The reading code stores the information back into the read-time font and charsetinfos. We would always be able to read what is written, we would just be missing any information that new fontdescriptors expect to have.
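The property-list idea can be sketched like this (hypothetical Python stand-in for the proposed medleyfont read/write code; the forward-compatibility property is that unknown fields written by an old system are skipped, and fields a new FONTDESCRIPTOR expects but the file lacks simply stay unset):

```python
import ast

def write_props(fields):
    # Serialize a record as alternating name/value lines: a property list.
    out = []
    for name, value in fields.items():
        out.append(name)
        out.append(repr(value))
    return "\n".join(out)

def read_props(text, known_fields):
    # Rebuild only the fields the current datatype definition knows about;
    # unknown properties are ignored, missing ones are simply absent.
    items = text.split("\n")
    result = {}
    for name, value in zip(items[0::2], items[1::2]):
        if name in known_fields:
            result[name] = ast.literal_eval(value)
    return result
```

So a file written against an old record definition still reads cleanly against a new one, which is the situation the uglyvar format couldn't survive.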
Actually, there is one way in which the medleyfont file may have a dependency on a current datatype definition. A fontdescriptor has a field OTHERDEVICEFONTPROPS that the generic font code doesn't know about.
Since the generic code doesn't know what might be in there, I am using HPRINT to put that out on the file. If there happened to be an embedded datatype that changed in future versions, there could be a problem.
A possible solution is to ask the device implementor to provide a function for writing out that property, making use of the common property writing function (MEDLEYFONT.WRITE.DATA) to structure the output in an always readable way.
Currently, Postscript seems to be the only device that makes use of this, and it seems to be a normal PRINTable list structure except for the fact that it includes an array. As long as it doesn't have a datatype, it should be OK.
> Currently, Postscript seems to be the only device that makes use of this
Looking back at the code for POSTSCRIPTSTREAM I'm pretty sure that this field is used only in-memory and doesn't need to be persisted. It is how the FONTDESCRIPTOR holds the extra information from the .PSCFONT file that the imagestream needs to generate proper PostScript.
I'm not sure it is worth the trouble to create a new universal font file format for all devices, which is what it appears that you are proposing. (My apologies if I have misinterpreted.)
There can be different font file needs for display fonts in the (near?) future. If we get gray scale or full color display implemented, then fonts may need alpha or RGB info in addition to bitmaps. Or maybe we can use true type fonts directly. In any case, it seems that it is essential that the FONTDESCRIPTOR have an OTHERDEVICEFONTPROPS field to hold device-specific information that is set by the device's font creation mechanism.
The idea is to represent on a file all the information that ends up in a fontdescriptor, of whatever kind from whatever source.
So, to the extent that a fontdescriptor has what it takes to drive a particular device, that information can be written and read from a file.
I think the convention used by X implementors was to add another field of "SCALE" to FAMILY, SIZE, FACE, ROTATION, CHARSET.
https://straightrunning.com/XmingNotes/fonts.php (ignore warnings)
I thought that the rendering of "spline" fonts on color or gray-scale displays doesn't rely on BITBLT but rather uses the gray scale to get antialiasing, and even sub-pixel use of the color mask pattern. At that point we would probably do what Emacs Lisp did for its Unicode rendering of emoji in fonts.
We might be better off trying to revive the XMAS code