Need Nemeth and UEB braille encodings for new Unicode characters

Open NSoiffer opened this issue 1 year ago • 12 comments

The following characters are proposed to be added to Unicode (in 2025). They are used in chemistry, and some of them were surprisingly left out of Unicode. Some of these are similar to existing characters but are longer versions -- the short versions aren't the ones used in standard chemistry notation.

@rob-aph: can you tell me what the Nemeth and UEB should be for them?

image

[There is a typo for the code point for the third item above]

image

image

NSoiffer avatar Nov 24 '23 07:11 NSoiffer

I'm blind, so I won't be of much help with these images. If there are some proposed descriptions somewhere, I could look them over and see if anything matches up with what I know.

rob-aph avatar Nov 27 '23 11:11 rob-aph

My apologies, I didn't realize you were blind. Let me try again. Hopefully the descriptions are helpful, but please ask for more info if they aren't:

Chemistry Equilibrium arrows (some added not because of use, but for symmetry):

  • 1F8D0 LONG RIGHTWARDS ARROW OVER LONG LEFTWARDS ARROW
  • 1F8D1 LONG RIGHTWARDS HARPOON OVER LONG LEFTWARDS HARPOON
  • 1F8D2 LONG RIGHTWARDS HARPOON ABOVE SHORT LEFTWARDS HARPOON
  • 1F8D3 SHORT RIGHTWARDS HARPOON ABOVE LONG LEFTWARDS HARPOON
  • 1F8D4 LONG LEFTWARDS HARPOON ABOVE SHORT RIGHTWARDS HARPOON
  • 1F8D5 SHORT LEFTWARDS HARPOON ABOVE LONG RIGHTWARDS HARPOON

Arrows for unsuccessful reactions:

  • 1F8D6 LONG RIGHTWARDS ARROW THROUGH X
  • 1F8D7 LONG RIGHTWARDS ARROW WITH DOUBLE SLASH
  • 1F8D8 LONG LEFT RIGHT ARROW WITH DEPENDENT LOBE

The last symbol above is kind of strange. It is a not quite closed ellipse that hangs down from the center of the arrow.

The last proposed symbol is the standard state symbol:

  • 29B6 MEDIUM SMALL WHITE CIRCLE WITH HORIZONTAL BAR

A little more discussion about the last symbol:

There is a slightly similar existing character: U+29B5, which has been identified by nameslist annotation as the character to use for standard state. However, as can be seen, this identification appears to be an example of arms’ length unification. The glyph for U+29B5 is much too large, and the standard state symbol should also have a significantly longer horizontal line (as mentioned in the Wikipedia link).

NSoiffer avatar Nov 30 '23 21:11 NSoiffer

@rob-aph: are you able to generate these? I'm hoping to do another release mid week.

NSoiffer avatar Dec 10 '23 06:12 NSoiffer

So sorry, my ability to participate toward the end of the year is limited. I'll look these over shortly and see if I can come up with reasonable translations.

rob-aph avatar Dec 15 '23 10:12 rob-aph

OK. This is mostly above my experience level, but I took a stab anyway. Seems like many of these would have specific braille symbols, but it looks like they don't, so I synthesized them from section 13 in the GTM. Here's what I have so far for Unicode:

1F8D0 ⠳⠒⠒⠒⠪⠨⠔⠳⠒⠒⠒⠕
1F8D1 ⠳⠒⠒⠒⠠⠗⠪⠨⠔⠳⠒⠒⠒⠈⠗⠕
1F8D2 ⠳⠠⠗⠪⠨⠔⠳⠒⠒⠒⠈⠗⠕
1F8D3 ⠳⠒⠒⠒⠠⠗⠪⠨⠔⠳⠈⠗⠕
1F8D4 ⠳⠈⠗⠕⠨⠔⠳⠒⠒⠒⠠⠗⠪
1F8D5 ⠳⠒⠒⠒⠈⠗⠕⠨⠔⠳⠠⠗⠪

I'll continue next week, but I'm not sure I can come up with anything for those last ones. I think we need to find someone to review this.

rob-aph avatar Dec 15 '23 17:12 rob-aph

Thanks! If you can provide Nemeth encodings, that would be really helpful also.

NSoiffer avatar Dec 15 '23 18:12 NSoiffer

It turns out I already had three of them in my UEB table. They come from GTM 16.5. The pictures there show somewhat short arrows, which is incorrect for chemistry; the standard convention is to use long arrows. For 1F8D1, they use dots 45-456-2356. I'm not sure what to use. In any case, the patterns they use are much shorter than the ones you proposed.

I'll wait for your review before adding them to UEB.

NSoiffer avatar Dec 16 '23 22:12 NSoiffer

I'm wondering, what is the visual difference between 1F8D1 and 21CC? I assumed that 21CC maps to ⠘⠸⠶ as you have it in your table, so it doesn't seem right that 1F8D1 should map to the same braille characters.

Yes, their version is shorter than mine, because they invented a combination (kind of like a contraction I guess) for that symbol.
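For readers checking the dot numbers against the braille cells quoted above, here is a minimal sketch (not part of MathCAT; the function names are hypothetical) that converts a dot-number string such as 45-456-2356 into Unicode braille. It relies only on the layout of the Unicode Braille Patterns block, where the cell for a set of dots is U+2800 plus a bit mask with bit n-1 set for dot n.

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def dots_to_cell(dots: str) -> str:
    """Convert one cell's dot numbers, e.g. "456", to a braille character."""
    mask = 0
    for digit in dots:
        n = int(digit)
        if not 1 <= n <= 8:
            raise ValueError(f"invalid dot number: {digit!r}")
        mask |= 1 << (n - 1)  # dot n is stored in bit n-1
    return chr(BRAILLE_BASE + mask)


def dots_to_braille(spec: str) -> str:
    """Convert a hyphen-separated list of cells, e.g. "45-456-2356"."""
    return "".join(dots_to_cell(cell) for cell in spec.split("-"))


print(dots_to_braille("45-456-2356"))  # prints ⠘⠸⠶
```

This confirms that dots 45-456-2356 and ⠘⠸⠶ are the same three cells, so the GTM pattern and the existing table entry for 21CC agree.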

rob-aph avatar Dec 18 '23 12:12 rob-aph

U+21CC is short Rightwards Harpoon Over Leftwards Harpoon. U+1F8D1 is long Rightwards Harpoon Over Leftwards Harpoon. That is, the only difference is short vs long. The long version is what is used in chemistry. Perhaps because there was no Unicode char, GTM uses a short equilibrium arrow, but a long one is really the appropriate one to use. It is the one that the mhchem package for TeX synthesizes.

So there is a dilemma: use the likely inappropriate one used in the spec for what would be generated by TeX (and eventually other packages when the character makes its way into fonts) or go with the correct but longer one and claim the spec got it wrong.

Any thoughts?

NSoiffer avatar Dec 18 '23 19:12 NSoiffer

I guess you could go with U+21CC until U+1F8D1 makes it into the Unicode spec, or use both with a comment in the table.
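One way to picture the interim option is a lookup with an alias, sketched below. This is an illustration only, not MathCAT's actual table format: the names `UEB_BRAILLE`, `INTERIM_ALIASES`, and `ueb_for` are hypothetical, and the only braille value used is the one quoted in this thread for 21CC.

```python
# The braille for U+21CC is the value quoted in this thread; nothing else is filled in.
UEB_BRAILLE = {
    "\u21CC": "⠘⠸⠶",  # RIGHTWARDS HARPOON OVER LEFTWARDS HARPOON (short form)
}

# Hypothetical alias table: new code point -> existing code point to borrow from.
# Interim only: reuse the short form until the long character has a reviewed encoding.
INTERIM_ALIASES = {
    "\U0001F8D1": "\u21CC",  # LONG RIGHTWARDS HARPOON OVER LONG LEFTWARDS HARPOON
}


def ueb_for(ch):
    """Return the UEB braille for ch, following an interim alias if needed."""
    if ch in UEB_BRAILLE:
        return UEB_BRAILLE[ch]
    alias = INTERIM_ALIASES.get(ch)
    return UEB_BRAILLE.get(alias) if alias is not None else None


print(ueb_for("\U0001F8D1"))  # prints ⠘⠸⠶
```

Keeping the alias in a separate table makes the temporary mapping easy to find and remove once a dedicated encoding for the long character has been agreed on.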

rob-aph avatar Dec 18 '23 23:12 rob-aph

After more reading and looking at examples in the GTM, I would like to amend my UEB translations to the following:

1F8D0 ⠳⠒⠒⠒⠕⠻⠳⠒⠒⠒⠪
1F8D1 ⠳⠒⠒⠒⠈⠗⠕⠻⠳⠒⠒⠒⠠⠗⠪
1F8D2 ⠳⠒⠒⠒⠈⠗⠕⠻⠳⠒⠠⠗⠪
1F8D3 ⠳⠒⠈⠗⠕⠻⠳⠒⠒⠒⠠⠗⠪
1F8D4 ⠳⠒⠒⠒⠈⠗⠪⠻⠳⠒⠠⠗⠕
1F8D5 ⠳⠒⠈⠗⠪⠻⠳⠒⠒⠒⠠⠗⠕
1F8D6 ⠳⠒⠒⠒⠕⠯⠠⠭
1F8D7 ⠳⠒⠒⠒⠕⠯⠣⠸⠌⠸⠌⠜
1F8D8: (no idea)
29B6 ⠐⠠⠤⠯⠫⠿

Notes: All of these either require grade 1 symbol mode in places, grade 1 word mode, or grade 1 passage mode. (I left out all of such indicators.) Example: 1F8D6 should probably be in passage mode.

Also, with 1F8D6, I could not find an answer to whether or not the capital X (dot 6, dots 1 3 4 6) needs grouping symbols around it. Strangely, the spec doesn't specify capital letters as an item. I assume grouping indicators are not required here. Or perhaps a small letter x is what is needed here?
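Reviewers who want to check these proposals cell by cell can decode the Unicode braille back into dot numbers. The sketch below is a hypothetical helper, not something from MathCAT or the GTM; it assumes only the Unicode Braille Patterns layout, in which bit n-1 of the offset from U+2800 represents dot n.

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def cell_to_dots(cell: str) -> str:
    """Return the dot numbers of one braille cell, e.g. "1346"."""
    offset = ord(cell) - BRAILLE_BASE
    if not 0 <= offset <= 0xFF:
        raise ValueError(f"not a braille pattern: {cell!r}")
    # Bit i of the offset corresponds to dot i+1.
    return "".join(str(i + 1) for i in range(8) if offset & (1 << i))


def braille_to_dots(text: str) -> str:
    """Return hyphen-separated dot numbers for a string of braille cells."""
    return "-".join(cell_to_dots(cell) for cell in text)


print(braille_to_dots("⠠⠭"))  # prints 6-1346
```

Run on the last two cells of the proposed 1F8D6 translation, it gives dot 6 followed by dots 1346, which is the capital X described here.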

It looks like U+29B6 is already claimed (Circled Vertical Bar.) if that matters.

I'll try to work on the Nemeth versions of these before the end of the year.

rob-aph avatar Dec 18 '23 23:12 rob-aph

> It looks like U+29B6 is already claimed (Circled Vertical Bar.) if that matters.

Typo -- should be 2B96

NSoiffer avatar Dec 30 '23 04:12 NSoiffer

@rob-aph: any chance you can figure out the Nemeth for the chars mentioned above (https://github.com/NSoiffer/MathCAT/issues/228#issuecomment-1834606180)? I'd like to close this issue out.

NSoiffer avatar Sep 11 '24 04:09 NSoiffer

Hi Neil,

I am afraid I don't have Nemeth translations for these characters. Our project was terminated in January 2024, and I was unable to make any more progress up to that point.

rob-aph avatar Sep 11 '24 10:09 rob-aph

I'm sorry to hear that. Does that mean APH is no longer using MathCAT?

NSoiffer avatar Sep 11 '24 18:09 NSoiffer

I believe we are using it in Braille Blaster. (I'm not on the Braille Blaster team, so I really don't know much about the product, but I believe they were and probably are still using MathCAT.)

rob-aph avatar Sep 11 '24 18:09 rob-aph

Michael Whapples works on that. I thought you worked with him. What project were you on that got canceled? Are you no longer using MathCAT? Getting bug reports from you was very helpful.

FYI: I've closed this because I found Nemeth symbols in the new BANA Nemeth chemistry guidelines.

NSoiffer avatar Sep 11 '24 19:09 NSoiffer

The project I worked on was called Dots123. It was kind of a word processor for braille math, with translation in both directions, and importing/exporting Word files.

I am not currently using MathCAT in any of my projects, but I'm glad I was able to help somewhat. We likely would have used MathCAT for reverse translation as well if that had been available -- I saw it on your roadmap.

I'm glad you found those symbols; mine would have been highly speculative.

rob-aph avatar Sep 13 '24 11:09 rob-aph