Fix #3548: Match MathTex subscripts/superscripts by position

Open Nikhil172913832 opened this issue 1 month ago • 12 comments

Overview: What does this pull request change?

Fixes issue where MathTex submobjects did not correctly correspond to their tex_strings when using subscripts and superscripts in different orders. The fix uses geometric position matching specifically for script elements (^, _) to handle LaTeX's reordering while preserving sequential matching for non-script elements.

Motivation and Explanation: Why and how do your changes improve the library?

Problem: LaTeX compiles A ^n _1 and A _1 ^n to identical SVG output, so the rendered glyphs can appear in a different order than the scripts were specified. As a result, MathTex('A', '^n', '_1') and MathTex('A', '_1', '^n') had submobjects that did not match their original tex_strings, which broke operations like get_parts_by_tex() and set_color_by_tex().

Solution: Modified _break_up_by_substrings() method to detect script elements (tex strings starting with ^ or _) and match them to rendered submobjects based on geometric position (center point). Non-script elements continue using sequential matching to maintain backward compatibility and avoid issues with complex formulas.
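
To illustrate the idea of position-based matching, here is a minimal sketch in plain Python. It is not the code from this PR: the Glyph class, is_script() and match_scripts_by_position() are hypothetical names, and the sketch assumes one superscript and one subscript per base, distinguished only by the vertical coordinate of their centers.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Hypothetical stand-in for a rendered submobject; only its center height is used."""
    name: str
    center_y: float


def is_script(tex_string: str) -> bool:
    """A tex string counts as a script element if it starts with ^ or _."""
    return tex_string.lstrip().startswith(("^", "_"))


def match_scripts_by_position(script_strings, glyphs):
    """Pair '^...' with the highest remaining glyph and '_...' with the lowest one."""
    pool = sorted(glyphs, key=lambda g: g.center_y)  # lowest glyph first
    matches = {}
    for tex in script_strings:
        if not is_script(tex):
            raise ValueError(f"not a script element: {tex!r}")
        if tex.lstrip().startswith("^"):
            matches[tex] = pool.pop()   # take the highest glyph
        else:
            matches[tex] = pool.pop(0)  # take the lowest glyph
    return matches


# The rendered glyphs are the same regardless of the order in the source.
glyphs = [Glyph("1", -0.3), Glyph("n", 0.4)]
first = match_scripts_by_position(["^n", "_1"], glyphs)
second = match_scripts_by_position(["_1", "^n"], glyphs)
```

Both calls map '^n' to the upper glyph and '_1' to the lower one, which is the property the fix aims for. Nested or multiple scripts would need more than this simple height comparison.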

Impact: Users can now reliably access and manipulate subscripts/superscripts by their tex strings regardless of the order they're specified.

Links to added or changed documentation pages

No documentation changes required.

Further Information and Comments

  • Fixes #3548
  • All existing tests pass (20/20 in test_texmobject.py)
  • Added regression test: test_tex_strings_with_subscripts_and_superscripts()

Reviewer Checklist

  • [ ] The PR title is descriptive enough for the changelog, and the PR is labeled correctly
  • [ ] If applicable: newly added non-private functions and classes have a docstring including a short summary and a PARAMETERS section
  • [ ] If applicable: newly added functions and classes are tested

Nikhil172913832 avatar Oct 25 '25 10:10 Nikhil172913832

Thanks for the PR.

I have tried to run your new test without the suggested changes to the tex_mobject.py file. I would expect the test to reveal an issue with the matching process. But on my computer the test passes without any issues...

Similarly I have tested the full PR on the two scenes (Minimal and MinimalWithSum) reported in issue https://github.com/ManimCommunity/manim/issues/3548. Both scenes still fail even with the changes from this PR active. Do you get similar results?

henrikmidtiby avatar Oct 31 '25 21:10 henrikmidtiby

@henrikmidtiby Thanks for pointing that out. I had initially overlooked the fix, and since the test passed, I missed verifying whether the original issue was actually resolved. I’ve now updated the test and revised my approach. The test correctly fails on the main branch now.

Nikhil172913832 avatar Nov 01 '25 04:11 Nikhil172913832

Good progress.

I have tried to apply the current PR to the following test case (from https://github.com/ManimCommunity/manim/issues/3548)

from manim import *

class MinimalWithSum(Scene):
    def construct(self):
        """ This shows that substring may not correspond to tex shape """
        t2cm = {r'\sum': BLUE, '^n': RED, '_1': GREEN, 'x': YELLOW}
        eq1 = MathTex(r'\sum', '^n', '_1', 'x', tex_to_color_map=t2cm)
        eq2 = MathTex(r'\sum', '_1', '^n', 'x', tex_to_color_map=t2cm)

        font = {'font_size': 24}
        txts = [Text(sub.get_tex_string(), t2c=t2cm, **font) for sub in (eq1, eq2) for i in range(len(sub))]
        txt1 = VGroup(*txts[:4])
        txt2 = VGroup(*txts[4:])

        cap1 = Text('tex rendered', **font)
        cap2 = Text('tex substrings', **font)
        
        grp = VGroup(cap1, cap2, eq1, txt1, eq2, txt2).arrange_in_grid(3,2)
        grp.scale(2).move_to(ORIGIN)
        self.add(grp)

With this PR applied, it renders as shown here.

[Image: MinimalWithSum_ManimCE_v0.19.0]

This is more consistent than the render from the current main branch, which produces this output.

[Image: MinimalWithSum_ManimCE_v0.19.0]

I still think that the coloring is off in both cases, as I would expect the summation signs to be blue.

In addition I wonder if it is possible to extract some of the functionality into a separate method. The intention here is to make it easier to understand what the code is actually doing. Prior to this PR I had to pay close attention to understand the 26 lines of code in the _break_up_by_substrings method. The method is now close to 100 lines and I haven't yet managed to really understand what is happening (e.g. why should the order of the sorted_pool be reversed in some cases).

henrikmidtiby avatar Nov 03 '25 17:11 henrikmidtiby

[Image: Screenshot from 2025-11-04 12-20-04]

@henrikmidtiby Does it look correct now?

Nikhil172913832 avatar Nov 04 '25 06:11 Nikhil172913832

@Nikhil172913832 Much better! This is exactly what I would expect from reading the code for the MinimalWithSum scene.

henrikmidtiby avatar Nov 04 '25 07:11 henrikmidtiby

@henrikmidtiby I’ve made the necessary changes related to the colors. Please let me know if everything looks good.

Nikhil172913832 avatar Nov 06 '25 08:11 Nikhil172913832

Nice to see your progress on this. I have tried to find an example where the colors of the MathTex parts are assigned incorrectly, so far without success. However, I have found this example, where parts of the extracted tex strings seem to be duplicated under certain conditions.

from manim import *

class MinimalWithSumDifficult(Scene):
    def construct(self):
        """ This shows that substring may not correspond to tex shape """
        t2cm = {r'\sum': BLUE, '^n': RED, '_1': GREEN, 'x': YELLOW}
        eq1 = MathTex(r'\sum', '^n', '_1', 'x', '^2', '= n_2', tex_to_color_map=t2cm)
        eq2 = MathTex(r'\sum', '_1', '^n', 'x', '^2', '= n_2', tex_to_color_map=t2cm)

        font = {'font_size': 24}
        txts = [Text(sub.get_tex_string(), t2c=t2cm, **font) for sub in (eq1, eq2) for i in range(len(sub))]
        for txt in txts: 
            print(txt)
        txt1 = VGroup(*txts[:4])
        txt2 = VGroup(*txts[4:])

        cap1 = Text('tex rendered', **font)
        cap2 = Text('tex substrings', **font)
        
        grp = VGroup(cap1, cap2, eq1, txt1, eq2, txt2).arrange_in_grid(3,2)
        grp.scale(1).move_to(ORIGIN)
        self.add(grp)

On my computer it renders as shown here:

[Image: MinimalWithSumDifficult_ManimCE_v0.19.0]

It seems like the strings "^n" and "_1" have been duplicated in the lower equation.

henrikmidtiby avatar Nov 06 '25 22:11 henrikmidtiby

Now I managed to find a case, where the new code seems to render the equation badly.

from manim import *

class MathTexUnexpectedBehaviour(Scene):
    def construct(self):
        t = MathTex("\\int^b{{_a}} dx = b - a")
        self.add(t)

        t[1].set_color(RED)

Which renders as

[Image: MathTexUnexpectedBehaviour_ManimCE_v0.19.0]

Where I miss the upper limit of the integral. The issue disappears if the limits of the integral are interchanged.

from manim import *

class MathTexUnexpectedBehaviour(Scene):
    def construct(self):
        t = MathTex("\\int{{_a}}^b dx = b - a")
        self.add(t)

        t[1].set_color(RED)

[Image: MathTexUnexpectedBehaviour_ManimCE_v0.19.0]

henrikmidtiby avatar Nov 06 '25 22:11 henrikmidtiby

@henrikmidtiby, addressing your first issue:

In the original line:

txts = [Text(sub.get_tex_string(), t2c=t2cm, **font) for sub in (eq1, eq2) for i in range(len(sub))]

the expression for sub in (eq1, eq2) iterates over the MathTex objects themselves rather than their submobjects. Meanwhile, for i in range(len(sub)) loops over the number of submobjects, but Text(sub.get_tex_string(), ...) still calls .get_tex_string() on the full object instead of each submobject.
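
The effect can be reproduced without Manim. The sketch below uses a hypothetical stand-in class, FakeMathTex, which assumes that get_tex_string() returns the parts joined by a space; it is only meant to show why the original comprehension repeats the full string.

```python
class FakeMathTex:
    """Hypothetical stand-in for MathTex: a full string plus its list of parts."""

    def __init__(self, *parts):
        self.parts = list(parts)

    def get_tex_string(self):
        # Assumption: the full string is the parts joined by a single space.
        return " ".join(self.parts)

    def __len__(self):
        return len(self.parts)

    def __iter__(self):
        return iter(self.parts)


eq1 = FakeMathTex(r"\sum", "^n", "_1", "x")
eq2 = FakeMathTex(r"\sum", "_1", "^n", "x")

# Original pattern: `i` is never used, so each entry is the full string of `sub`.
repeated = [sub.get_tex_string() for sub in (eq1, eq2) for i in range(len(sub))]

# Per-part pattern: iterate over the parts of each equation instead.
per_part = [part for eq in (eq1, eq2) for part in eq]
```

Here repeated holds four copies of the full string of eq1 followed by four copies of the full string of eq2, while per_part holds the eight individual parts in order.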

A better approach would be:

from manim import *

class MinimalWithSumDifficult(Scene):
    def construct(self):
        t2cm = {r'\sum': BLUE, '^n': RED, '_1': GREEN, 'x': YELLOW}
        eq1 = MathTex(r'\sum', '^n', '_1', 'x', '^2', '= n_2', tex_to_color_map=t2cm)
        eq2 = MathTex(r'\sum', '_1', '^n', 'x', '^2', '= n_2', tex_to_color_map=t2cm)

        font = {'font_size': 24}
        # convert each submobject to Text, arrange horizontally
        def make_text_group(eq):
            txts = [Text(str(mob.tex_string), t2c=t2cm, **font) for mob in eq]
            for prev, cur in zip(txts, txts[1:]):
                cur.next_to(prev, RIGHT, buff=0.1)
            return VGroup(*txts)

        txt1 = make_text_group(eq1)
        txt2 = make_text_group(eq2)

        cap1 = Text('TeX rendered', **font)
        cap2 = Text('TeX substrings', **font)

        grp = VGroup(
            cap1, cap2,
            eq1, txt1,
            eq2, txt2
        ).arrange_in_grid(rows=3, cols=2, buff=0.6)

        grp.move_to(ORIGIN)
        self.add(grp)

Which renders as:

[Image]

Nikhil172913832 avatar Nov 07 '25 06:11 Nikhil172913832

@henrikmidtiby, addressing your second issue:

MathTex("\\int^b{{_a}} dx = b - a")

was missing the upper limit b because my code was incorrectly trying to pair \int^b with _a as a base+scripts group.

After looking into it, I found that the base element \int^b already contained its own superscript ^b, but my code didn’t detect this and still applied the base+scripts logic, consuming submobjects incorrectly.

I added a check to skip the base+scripts logic if the base element’s tex_string already contains ^ or _, indicating that it already has scripts attached.
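
A minimal sketch of such a check, in plain Python. The function names are hypothetical and this is not the code from the PR; it only mirrors the rule described above, namely that a base whose tex string already contains ^ or _ is not grouped with a following script.

```python
def base_already_has_scripts(base_tex: str) -> bool:
    """Return True if the base tex string already carries a ^ or _ of its own."""
    return "^" in base_tex or "_" in base_tex


def should_group_with_scripts(base_tex: str, next_tex: str) -> bool:
    """Group base and script only if the base is script-free and a script follows."""
    next_is_script = next_tex.lstrip().startswith(("^", "_"))
    return next_is_script and not base_already_has_scripts(base_tex)


# The failing case: the base \int^b already has a superscript, so no grouping.
skip_grouping = not should_group_with_scripts(r"\int^b", "_a")
# The working case: the plain base \int is grouped with the script that follows.
do_grouping = should_group_with_scripts(r"\int", "_a")
```

A plain substring test like this also triggers on an escaped underscore such as \_, so a real implementation may need a stricter test.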

Now:

from manim import *

class MathTexUnexpectedBehaviour(Scene):
    def construct(self):
        t = MathTex("\\int^b{{_a}} dx = b - a")
        self.add(t)
        t[1].set_color(RED)

renders as:

[Image]

Nikhil172913832 avatar Nov 07 '25 07:11 Nikhil172913832

@Nikhil172913832 Thanks for addressing the two issues. This PR clearly improves how the tex_to_color_map option is handled, and makes it usable in more situations.

I spent some time searching for an example where the PR would fail, and eventually found the following:

from manim import *

class MinimalWithSumVeryDifficult(Scene):
    def construct(self):
        t2cm = {r'\sum': BLUE, 'n_2': RED, '_1': GREEN}
        eq1 = MathTex(r'\sum^{n_2^3}_1', tex_to_color_map=t2cm)
        eq2 = MathTex(r'\sum_1^{n_2^3}', tex_to_color_map=t2cm)

        font = {'font_size': 24}
        # convert each submobject to Text, arrange horizontally
        def make_text_group(eq):
            txts = [Text(str(mob.tex_string), t2c=t2cm, **font) for mob in eq]
            for prev, cur in zip(txts, txts[1:]):
                cur.next_to(prev, RIGHT, buff=0.1)
            return VGroup(*txts)

        txt1 = make_text_group(eq1)
        txt2 = make_text_group(eq2)

        cap1 = Text('TeX rendered', **font)
        cap2 = Text('TeX substrings', **font)

        grp = VGroup(
            cap1, cap2,
            eq1, txt1,
            eq2, txt2
        ).arrange_in_grid(rows=3, cols=2, buff=0.6)

        grp.move_to(ORIGIN)
        self.add(grp)

I don't know if it is possible to make this work for all potential cases, without reimplementing most parts of the external latex parser. I don't think that would be worth the effort though.

In search of an alternative, I found a post in #dev-chat on the Manim Discord server, where Benjamin Hackl mentioned a potentially more stable approach some time ago. https://discord.com/channels/581738731934056449/1023550532914266142/1406959019868029043

Benjamin Hackl — 8/18/25, 1:12 PM
I learned something completely insane yesterday, which strongly motivates completely rewriting Tex and friends. did any of you know that it is possible to insert commands in a given TeX code that are being picked up by dvisvgm? we can actually insert a bunch of <g id="manim-group-xyz"> </g> in the SVG produced by dvisvgm 👀

uwezi — 8/18/25, 2:23 PM
how? That sounds quite useful!

Benjamin Hackl — 8/18/25, 2:27 PM
indeed, and i think it resolves all sort of TeX-splitting issues; from the bit of testing I did yesterday it actually seemed quite robust. And no need to artificially split TeX strings anywhere...

the command is simply
\special{dvisvgm:raw <g id="something-unique">}
...
\special{dvisvgm:raw </g>}

and after generating the corresponding svg via dvisvgm the glyphs resulting from the TeX code in between should be wrapped in a proper svg group with the given id. 👀
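
As a rough illustration of how this could be used from Python, the sketch below wraps a TeX fragment in the two \special commands quoted above and then looks up tagged groups in an SVG string. The helper names, the "manim-group-" id prefix and the sample SVG are assumptions made for this example; it does not describe how Manim implements or plans to implement this, and it does not address where the specials can safely be placed around subscripts and superscripts.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def wrap_in_svg_group(tex: str, group_id: str) -> str:
    """Surround a TeX fragment with dvisvgm raw specials that open and close an SVG group."""
    return (
        rf'\special{{dvisvgm:raw <g id="{group_id}">}}'
        + tex
        + r"\special{dvisvgm:raw </g>}"
    )


def group_ids(svg_text: str, prefix: str = "manim-group-"):
    """Return the ids of all <g> elements whose id starts with the given prefix."""
    root = ET.fromstring(svg_text)
    return [
        g.get("id")
        for g in root.iter(f"{SVG_NS}g")
        if g.get("id", "").startswith(prefix)
    ]


wrapped = wrap_in_svg_group("x", "manim-group-0")

# Hand-written sample, standing in for what dvisvgm might emit for the wrapped code.
sample_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<g id="page1"><g id="manim-group-0"><path d="M0 0"/></g></g>'
    "</svg>"
)
found = group_ids(sample_svg)
```

With ids like these, each tex string could be mapped to its glyphs by looking up the group, instead of guessing from glyph order or position.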

henrikmidtiby avatar Nov 08 '25 20:11 henrikmidtiby

@henrikmidtiby Thanks for the detailed feedback and for sharing that Discord thread. I agree — handling every TeX edge case isn’t practical without deeper parsing. It makes sense to wait for the \special{dvisvgm:raw} approach rather than adding a temporary fix.

Nikhil172913832 avatar Nov 09 '25 05:11 Nikhil172913832