How to ignore the conversion of some special characters, such as those containing Chinese characters

Open guoyutao opened this issue 1 year ago • 3 comments

Issue Summary

How can I make MathJax skip the conversion of strings that contain certain special characters, such as Chinese characters? For example, '一个苹果$1一个香蕉$2' ("one apple $1, one banana $2"). I don't want MathJax to process this string because it is not a mathematical formula; I just want to display the original string '一个苹果$1一个香蕉$2'.

How can I determine whether a candidate formula contains Chinese before conversion and, if it does, skip the conversion? Below is a regular-expression check for Chinese characters. Perhaps there is a better way that I don't know of. Thank you very much.

function hasChinese(str) {
  // The "g" flag is not needed for a single test() call, so it is omitted.
  var reg = /[\u4e00-\u9fa5]/;
  return reg.test(str);
}

Technical details:

  • MathJax Version: 4.0.6
  • Client OS: (e.g., Mac OS X 14.3.1)
  • Browser: (e.g., Chrome)

I am using the following MathJax configuration:

window.MathJax = {
      options: {
        enableMenu: false,
        menuOptions: {
          settings: {
            enrich: false,
            braille: false
          },
        },
        skipHtmlTags: ['script', 'noscript', 'style', 'textarea', 'pre', 'code',
          'a'],
        ignoreHtmlClass: 'tex2jax_ignore',
        processHtmlClass: 'tex2jax_process',
      },
      loader: { load: ['[tex]/texhtml'] },
      tex: {
        allowTexHTML: true,
        packages: { '[+]': ['texhtml'] },
        inlineMath: [
          ['$', '$'],
          ['\\(', '\\)']
        ],
        displayMath: [
          ['$$', '$$'],
          ['\\[', '\\]'],
        ]
      },
      chtml: {
        displayOverflow: 'linebreak',
        displayAlign: 'left',
        // scale: 2.2,
        minScale: .65,
        mtextInheritFont: !0,
        merrorInheritFont: !0,
        skipAttributes: {},
        exFactor: 18,
        displayIndent: "0",
        matchFontHeight: 0,
        adaptiveCSS: !0
      },
      includeHtmlTags: {
        br: "\n",
        wbr: "",
        "#comment": ""
      },
      linebreaks: {                  // options for when overflow is linebreak
        inline: true,                   // true for browser-based breaking of inline equations
        width: '95%',                  // a fixed size or a percentage of the container width
        lineleading: 2,                // the default lineleading in em units
        LinebreakVisitor: null,         // The LinebreakVisitor to use
      },
      output: {
        linebreaks: {
          inline: true,
        },
        font: 'mathjax-modern'
      },
      startup: {
        ready() {
          const { ChtmlMath } = MathJax._.output.chtml.Wrappers.math;
          delete ChtmlMath.styles['mjx-container[jax="CHTML"] mjx-break::after'];
          ChtmlMath.styles['mjx-container[jax="CHTML"] mjx-break'] = {
            'white-space': 'normal',
            'font-family': 'MJX-BRK'
          };
          MathJax.startup.defaultReady();
          const adaptor = MathJax.startup.adaptor;
          MathJax.startup.document.outputJax.postFilters.add(({ data }) => {
            for (const brk of adaptor.tags(data, 'mjx-break')) {
              brk.innerHTML = ' ';
            }
          });
        }
      }
    }

and loading MathJax via

<script type="text/javascript" id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/[email protected]/tex-mml-chtml.js"></script>

    <div ref="exampleRef" class="example_class" v-html="tempstr">
    </div>

    const tempstr = ref('$1一个香蕉$');

    nextTick(() => {
            window.MathJax.typesetPromise([exampleRef.value]);
     });

  .example_class {
    width: 100px;
    background-color: #fff;
  }

guoyutao, Jul 25 '24 03:07

I don't think matching against Chinese is really the right answer here.

When you use dollar signs within text and don't want them to be treated as math delimiters, you need to quote them with a backslash, as in \$1一个香蕉\$2. To get a literal backslash into a JavaScript string, you need to use two backslashes:

const tempstr = ref('\\$1一个香蕉\\$2');

The handling of \$ as an escape for $ is controlled by the processEscapes option in the tex block of your configuration. It is true by default, so this should work as is.
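As a minimal sketch of the two points in this comment: the object below shows where processEscapes sits among the tex options, and the string shows what the doubled backslashes produce. The names `texOptions` and `escaped` are illustrative only and do not come from the original configuration; setting processEscapes explicitly is redundant, since it already defaults to true.

```javascript
// Illustrative names; these options belong in the `tex` block of the MathJax configuration.
const texOptions = {
  inlineMath: [['$', '$'], ['\\(', '\\)']],
  processEscapes: true // already the default: \$ is treated as a literal dollar sign
};

// In JavaScript source, '\\' yields a single backslash in the resulting string,
// so the text MathJax receives is \$1一个香蕉\$2.
const escaped = '\\$1一个香蕉\\$2';

console.log(texOptions.processEscapes); // true
console.log(escaped.length);            // 10
console.log(escaped);
```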

dpvc, Jul 31 '24 19:07

But the text content is not known in advance; it is returned by ChatGPT. So is my only option to match '$...$' with a regular expression, check whether there is Chinese inside, and then insert a '\' before each '$'?

guoyutao, Aug 01 '24 08:08

> the text content is not known in advance, it is returned by the ChatGPT.

Maybe you can tell it to use \$ to indicate quantities with units of dollars? Or to use \$ for any dollar sign that isn't a LaTeX in-line delimiter? Or to use \(...\) for in-line math delimiters?
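One way to apply the last of these suggestions is to drop the single-dollar pair from the in-line delimiters, so that only \(...\) starts in-line math. The following is a minimal sketch under stated assumptions, not a tested drop-in: `texOptionsNoDollar` and `systemPromptHint` are hypothetical names, the prompt wording is only an illustration, and the delimiter settings would need to be merged into the `tex` block of the existing configuration.

```javascript
// Hypothetical delimiter settings for the `tex` block of the MathJax configuration.
// With no ['$', '$'] pair, a lone "$" in running text is never a math delimiter.
const texOptionsNoDollar = {
  inlineMath: [['\\(', '\\)']],
  displayMath: [['$$', '$$'], ['\\[', '\\]']]
};

// Hypothetical instruction to send along with the request to the model.
const systemPromptHint =
  'Write in-line math as \\( ... \\) and never use single dollar signs as math delimiters.';

console.log(texOptionsNoDollar.inlineMath.length); // 1
console.log(systemPromptHint);
```

This only helps if the model follows the instruction; any $...$ math it still emits would then be left as plain text.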

> I can only use regular matching '$...$', and then determine if there is Chinese inside, and then split the character '\' before '$'?

If the above suggestions don't work, then yes, that is probably the way to go. There are valid LaTeX expressions like $\text{This is an $x$.}$ where the dollars are nested, but you probably won't run into those.
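The regular-expression fallback described in this thread could be sketched roughly as follows. This is an assumption-laden illustration rather than a vetted solution: `escapeNonMathDollars`, `CHINESE`, and `INLINE_DOLLARS` are hypothetical names, the Chinese range is the one from the original hasChinese function, and the pattern relies on regular-expression lookbehind, which older browsers may not support. It does not handle nested dollars such as $\text{This is an $x$.}$, and it can mis-pair dollars in text like "$5 and $x$".

```javascript
// Same character range as the hasChinese() function in the question; no "g" flag,
// so test() carries no state between calls.
const CHINESE = /[\u4e00-\u9fa5]/;

// An unescaped single "$", some content without "$" or newlines, then an
// unescaped single "$". Pairs that are part of "$$" are skipped.
const INLINE_DOLLARS = /(?<![\\$])\$([^$\n]+)(?<!\\)\$(?!\$)/g;

// Escape both dollars of any $...$ pair whose content contains Chinese, so that
// MathJax (with processEscapes enabled) shows them as literal dollar signs.
function escapeNonMathDollars(text) {
  return text.replace(INLINE_DOLLARS, (match, inner) =>
    CHINESE.test(inner) ? '\\$' + inner + '\\$' : match
  );
}

console.log(escapeNonMathDollars('一个苹果$1一个香蕉$2')); // 一个苹果\$1一个香蕉\$2
console.log(escapeNonMathDollars('Energy is $E = mc^2$')); // Energy is $E = mc^2$
```

The function would be applied to the model's reply before it is assigned to the string rendered with v-html, so that MathJax only sees the already escaped text.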

dpvc, Aug 01 '24 13:08