
Research the dieresis situation

Open ForNeVeR opened this issue 7 years ago • 13 comments

There's a strange contradiction that I even had to describe in the documentation:

How to know that O 177 is the same as code="196"? To do that, first look into cmmi10.vpl file: there's the following entry:

(CHARACTER O 177 (comment dieresis)
   (CHARWD R 583)
   (CHARHT R 705)
   (CHARIC R 118)
   (MAP
      (SETCHAR O 151)
      )
   )

That means that O 177 is named dieresis. Then, open Adobe Glyph List and search for the dieresis name:

dieresis;00A8

It means that O 177 should be character 0xa8 or 168, not 196. The reason for that contradiction is currently unknown.
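The numbers in the quoted passage can be checked with a minimal sketch. It only parses the VPL header line and the Adobe Glyph List line quoted above; the helper names `parse_vpl_header` and `parse_agl_line` are hypothetical and are not part of any tool mentioned in this thread.

```python
import re

# Header line copied from the cmmi10.vpl entry quoted above.
VPL_HEADER = "(CHARACTER O 177 (comment dieresis)"
# The Adobe Glyph List line quoted above, in its "name;hex" format.
AGL_LINE = "dieresis;00A8"
# The value from the XML that this issue asks about.
XML_CODE = 196


def parse_vpl_header(line):
    """Return (character code, glyph name) from an octal CHARACTER header."""
    match = re.match(r"\(CHARACTER O ([0-7]+) \(comment (\w+)\)", line)
    if match is None:
        raise ValueError(f"not an octal CHARACTER header: {line!r}")
    # "O" marks an octal number in the property-list format.
    return int(match.group(1), 8), match.group(2)


def parse_agl_line(line):
    """Return (glyph name, Unicode code point) from a "name;hex" line."""
    name, hex_code = line.split(";")
    return name, int(hex_code, 16)


font_code, vpl_name = parse_vpl_header(VPL_HEADER)
agl_name, unicode_code = parse_agl_line(AGL_LINE)

print(vpl_name, font_code)      # name and O 177 as a decimal number
print(agl_name, unicode_code)   # name and U+00A8 as a decimal number
print(hex(XML_CODE), oct(XML_CODE))
```

O 177 is 127 in decimal and U+00A8 is 168, so code="196" (0xC4, octal 304) matches neither the character's position in the font nor the Unicode code point of the dieresis.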

We need to research that: what does code="196" mean, and why isn't the dieresis 168?

We need to find the answer and write it into the documentation.

ForNeVeR avatar Jan 27 '18 10:01 ForNeVeR

The Typography library may provide support for automation here.

ForNeVeR avatar Jan 27 '18 10:01 ForNeVeR

If you use a .ttf file, I think Typography may help you.

Feel free to ask or create a new issue!

prepare avatar Jan 28 '18 02:01 prepare

I thought it was strange as well, but it isn't. If we try to render a character whose code is greater than a ushort can hold, it is an error, so we have to move the character to a free position below a ushort's max value. That position may have held another character (say %), but its code point now holds the new character. This is useful because a lot of Unicode math symbols have code points that are too large, and this is what we have to do to use them. You could use FontForge to see the actual glyph and its metrics, change the glyph and other properties, and even see what its code point is meant to hold.

B3zaleel avatar Sep 05 '18 12:09 B3zaleel

I think the issue is that the Unicode code point of the dieresis is 168 (U+00A8), but its glyph index is 196. /cc @prepare

Edit: Oh, so using Typography is a solution but that's not done yet. I guess this is just a TODO issue then. Nvm.

Happypig375 avatar Sep 05 '18 14:09 Happypig375

We don't know yet if Typography will help us with that case. Someone needs to research the situation.

ForNeVeR avatar Sep 05 '18 15:09 ForNeVeR

I never had to deal with Typeface and Glyph issues because Typography handles all of the table readings in CSharpMath. I don't need any JSON or XML files, just the OTF font files are enough. Plus, integrated typeface importing on-the-go is also possible thanks to Typography. However, if you have other solutions, please do share them.

Happypig375 avatar Sep 05 '18 16:09 Happypig375

Hello,

Do you have the latest font?

Can you post the expected/actual result of a dieresis glyph from your app?


I downloaded the cm* fonts from https://github.com/ForNeVeR/wpf-math/tree/master/src/WpfMath/Fonts.

Then I analyzed those fonts with the latest Typography branch (https://github.com/LayoutFarm/Typography/tree/post_table_rev).


Font analysis

At this time...

  1. None of the fonts have a 'kern' table. I read about this at https://github.com/ForNeVeR/wpf-math/pull/108/files#diff-2dc54592d7800db71c597c416c9a29abR63, and I don't know how to get 'kern' data from them.

  2. Dieresis glyph. (from https://github.com/ForNeVeR/wpf-math/pull/108/files#diff-2dc54592d7800db71c597c416c9a29abR45)

cmmi does not have a glyph named 'dieresis'.


pic 1: cmmi font contains 134 glyphs


The only font that contains 'dieresis' is cmr10.ttf (see pic 2). This font contains only 132 glyphs.

pic 2: dieresis, in cmr10.ttf, glyphIndex=131


pic 3: compare this with Microsoft VisualTrueType: dieresis, in cmr10.ttf, glyphIndex=131


prepare avatar Sep 06 '18 00:09 prepare

What happens when we enter a dieresis (¨) into a text box


First, I will test it with the Tahoma font.


pic 4: Tahoma, dieresis, glyph index=142

The following gif shows step-by-step...



pic 5: dieresis, codepoint 168 => cmap => glyph index = 142
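The lookup chain in pic 5 (codepoint => cmap => glyph index) can be sketched in a few lines. This is only an illustration under assumptions: it uses the Python fontTools package rather than Typography, the function names are hypothetical, and the toy tables at the bottom are made up, so their glyph indices do not correspond to Tahoma or cmr10.ttf.

```python
def glyph_for_codepoint(cmap, glyph_order, codepoint):
    """Follow codepoint => cmap => glyph name => glyph index.

    cmap maps code points to glyph names, and glyph_order lists the glyph
    names by glyph index. Returns None if the code point is not mapped.
    """
    name = cmap.get(codepoint)
    if name is None:
        return None
    return name, glyph_order.index(name)


def glyph_in_font(path, codepoint):
    """Run the same lookup against a real font file (needs fontTools)."""
    # Imported lazily so the pure-Python part above works without fontTools.
    from fontTools.ttLib import TTFont

    font = TTFont(path)
    cmap = font.getBestCmap() or {}  # None if the font has no Unicode cmap
    return glyph_for_codepoint(cmap, font.getGlyphOrder(), codepoint)


# Toy tables, made up for illustration only.
toy_cmap = {0x41: "A", 0xA8: "dieresis"}
toy_order = [".notdef", "A", "dieresis"]

print(glyph_for_codepoint(toy_cmap, toy_order, 0xA8))
```

With a real font, something like `glyph_in_font("cmr10.ttf", 0xA8)` would show which glyph the cmap actually selects for code point 168; the file path here is an assumption about where the font is stored.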

prepare avatar Sep 06 '18 00:09 prepare

What happens when we enter a dieresis (¨) into a text box with cmr10.ttf


At this time, as far as I know...

I don't know why the cmap maps to an incorrect glyph...


pic 6: cmr10.ttf, dieresis, codepoint 168 => cmap => incorrect glyph ??

prepare avatar Sep 06 '18 00:09 prepare

That is my first report...

I also need to investigate more.

prepare avatar Sep 06 '18 00:09 prepare

@prepare thank you so much for your investigation and for documenting it here.

You've taken the latest font we have (we bundle the font from the repository in our NuGet package), that's right. Actually, I can't remember why the dieresis caught my attention. Probably it was the first character I checked or something. Essentially, I was looking for an automated way of mapping font characters to their codes in our XML.

The main problem is that we still don't know how these codes were generated, and we can't proceed with automatic XML regeneration without that information.

ForNeVeR avatar Sep 06 '18 15:09 ForNeVeR

I think your XML was created by some complex TeX tool.


At this time...

... That means that O 177 is named dieresis

Why is dieresis marked as O 177?

We need to go to the original glyph definition of the cm font (https://ctan.org/tex-archive/fonts/cm/mf)

I think dieresis is here => http://mirror.hmc.edu/ctan/fonts/cm/mf/accent.mf (scroll to the end of the page)

cmchar "Umlaut (double dot) accent";
numeric dot_diam#,dot_diam; dot_diam#=max(dot_size#,cap_curve#);
beginchar(oct"177",9u#,min(asc_height#,10/7x_height#+.5dot_diam#),0);
dot_diam=max(tiny.breadth,hround(max(dot_size,cap_curve)-2stem_corr));
italcorr h#*slant+.5dot_diam#-2.25u#;
adjust_fit(0,0);
pickup tiny.nib; pos1(dot_diam,0); pos2(dot_diam,90);
x1=x2=2.75u; top y2r=h+1;
if bot y2l<x_height+o+slab: y2l:=min(y2r-eps,x_height+o+slab+.5tiny); fi
y1=.5[y2l,y2r]; dot(1,2); % left dot
pos3(dot_diam,0); penpos4(y2r-y2l,90); y3=y4=y1; x3=x4=w-x1; dot(3,4); % right dot
penlabels(1,2,3,4); endchar;

beginchar(oct"177",

At that time, it may have been called the 'Umlaut accent' or the 'double dot' accent, and then some tool mapped it to the name "dieresis" later.

prepare avatar Sep 06 '18 17:09 prepare

And ...

(CHARACTER C A
   (CHARWD R 0.750002)
   (CHARHT R 0.683332)
   (COMMENT
      (KRN O 177 R 0.138893)
      )
   )


I tried to read the definition of 'A' from http://mirror.hmc.edu/ctan/fonts/cm/mf/romanu.mf

% Character codes \0101 through \0132 are generated.

cmchar "The letter A";
beginchar("A",13u#,cap_height#,0);
adjust_fit(cap_serif_fit#,cap_serif_fit#);
numeric left_stem,right_stem,outer_jut,alpha;
right_stem=cap_stem-stem_corr;
left_stem=min(cap_hair if hefty: -3stem_corr fi,right_stem);
outer_jut=.8cap_jut; x1l=w-x4r=l+letter_fit+outer_jut+.5u; y1=y4=0;
x2-x1=x4-x3; x3r=x2r+apex_corr; y2=y3=h+apex_o+apex_oo;
alpha=diag_ratio(2,left_stem,y2-y1,x4r-x1l-apex_corr);
penpos1(alpha*left_stem,0); penpos2(alpha*left_stem,0);
penpos3(alpha*right_stem,0); penpos4(alpha*right_stem,0);
z0=whatever[z1r,z2r]=whatever[z3l,z4l];
if y0<h-cap_notch_cut: y0:=h-cap_notch_cut;
 fill z0+.5right{down}...{z4-z3}diag_end(3l,4l,1,1,4r,3r)
  --diag_end(4r,3r,1,1,2l,1l)--diag_end(2l,1l,1,1,1r,2r){z2-z1}
  ...{up}z0+.5left--cycle; % left and right diagonals
else: fill z0--diag_end(0,4l,1,1,4r,3r)--diag_end(4r,3r,1,1,2l,1l)
  --diag_end(2l,1l,1,1,1r,0)--cycle; fi % left and right diagonals
penpos5(whatever,angle(z2-z1)); z5=whatever[z1,z2];
penpos6(whatever,angle(z3-z4)); z6=whatever[z3,z4]; y6=y5;
if hefty: y5r else: y5 fi =5/12y0;
y5r-y5l=y6r-y6l=cap_band; penstroke z5e--z6e; % bar line
if serifs: numeric inner_jut; pickup tiny.nib;
 prime_points_inside(1,2); prime_points_inside(4,3);
 if rt x1'r+cap_jut+.5u+1<=lft x4'l-cap_jut: inner_jut=cap_jut;
 else: rt x1'r+inner_jut+.5u+1=lft x4'l-inner_jut; fi
 dish_serif(1',2,a,1/2,outer_jut,b,.6,inner_jut)(dark); % left serif
 dish_serif(4',3,c,1/2,inner_jut,d,1/3,outer_jut); fi % right serif
penlabels(0,1,2,3,4,5,6); endchar;

I think the kerning info for the letter A and O 177 (dieresis) may have been added later from other sources. I don't know yet.


prepare avatar Sep 06 '18 17:09 prepare