RFC: New Text API
Currently, the text API which uses Pango isn't great and I would like to refactor it. Here is my proposal:
-
New Class:
FormattedString- It's similar to a list of strings but would give more control over the formatting of each string. Initialising it would require the following optional parameters:-
text: The text to show. This would be added as the first in the list. -
font: The fonts of the strings. Default when not overridden individually. (much like howtextinMarkupTextcurrent is) -
font_size: The size of the font. Default when not overridden individually. -
fallback_font: Font to fall back to when the main one isn't available or doesn't contain a specific character. This can also be overridden individually. -
color: The color of the text. Default to white. Probably should be overridden by themes. -
strikethrough: whether to strikethrough -
strikethrough_color: the color of strikethrough -
insert_hypens: use hyphens when breaking a work (default to True) -
underline: Underline the text. -
overline: Draw an overline on text. -
underline_color: Color of underline -
overline_color: Color of overline -
baseline_shift: Set's baseline shift for a given text -
open_type_features: which features to enable -
language: The language of the text -
indent: the indent of the para -
style: styles of text -
weight: Font weight -
letter_spacing: Space between letters
and more see https://gnome.pages.gitlab.gnome.org/gtk/Pango/enum.AttrType.html
-
-
The
FormattedStringwill have the functions:-
clear: clear all the texts stored -
append: Add a new text. Accepts all arguments from__init__().
-
-
Addition should be possible with
FormattedString. If the addition isFormattedStringandFormattedStringit should add items from second to the first one. If the addition isFormattedStringandstra new text should be added to the original one. -
The
Textclass which exists should accept anstrand aFormattedString. The string passed here will be considered as Pango Markup. -
The
MarkupTextclass will be deprecated. -
All the
t2*parameters are deprecated and removed from theTextclass. -
Paragraphwill be aliased toText -
Textwill have new parameterswidthandheightwhich will be considered when creating text and paragraphs would be broken based on this. By defaultwidthandheightshould be full screen. -
Textwill be treated as a paragraph, which means it has parameters like -justify: Justify paragraph -align: Align the text should beleft,right,center-line_height: Height of each line. -
Textshould also accept parameters fromFormattedStringfor backwards compatibility.
Example Code
class TextScene(Scene):
def construct(self):
a = FormattedString(font="Roboto")
a += "Hello"
a += "World"
a.append("bold", weigth=BOLD)
self.add(Text(a))
CC: @PhilippImhof
This seems fairly reasonable, and like a good development to me. Some comments and questions:
- The
underline_colorandunderlinekeyword arguments (etc) could potentially be combined to justunderlinefor convenience. - It would be very nice if
Text(and probably alsoFormattedString) hadtextas a property. We often get requests for supporting syntax likemy_text.animate.set(text=...), and that might be a clean way to do that. - About setting width and height of texts automatically: Would this mean that the width and height of the mobject
Text("hello!")would also be the full screen size? Or would these "text width" and "text height" only be used for text formatting? Would it make sense to separate the behavior forText(not respecting any width/height by default) andParagraph(for which width/height is respected automatically, and if not specified they are set to the scene dimensions)? - What would be the result of
FormattedString("hello", color=YELLOW) + FormattedString("world", underline=True, underline_color=RED)?
The underline_color and underline keyword arguments (etc) could potentially be combined to just underline for convenience.
👍. Looks good, probably a dataclass representing them would be better.
It would be very nice if Text (and probably also FormattedString) had text as a property. We often get requests for supporting syntax like my_text.animate.set(text=...)�, and that might be a clean way to do that.
Should be possible but I guess only for Text. I don't think FormattedString needs something like that.
About setting width and height of texts automatically: Would this mean that the width and height of the mobject Text("hello!") would also be the full screen size? Or would these "text width" and "text height" only be used for text formatting?
The size of "hello" would depend on the font_size parameter and not width and height. Actually, width and height would only make a difference when there is a multiline text or a long line. If the line exceeds the screen(or the specified width) it will be wrapped to the next line. Probably calling it bbox is better I think. Thoughts?
Would it make sense to separate the behavior for Text (not respecting any width/height by default) and Paragraph (for which width/height is respected automatically, and if not specified they are set to the scene dimensions)?
As explained above width/height would make difference only for long sentences or paragraphs I think it's better to have Text the same as Paragraph. That would mean, Paragraph is just an alias for Text.
What would be the result of FormattedString("hello", color=YELLOW) + FormattedString("world", underline=True, underline_color=RED)``?
hello would be in YELLOW color and world would be the default color (ie white) and underlined in RED color.
I like the approach very much, it has several very good points:
- In order to bring non-LaTeX (I will not continue to explicitly say it for the remainder of my comment) text to the next level, it seems important for me to get rid of the old (archaic)
t2*formatting. While I consider PangoMarkup like inMarkupTextas the easiest solution, usingPangoAttrTypewill eventually offer more possibilities AFAICS. Your approach would both and thus bring the best of both worlds. - I like the idea of the
+operator. However, as @behackl already wrote, I see that there is a huge risk of "unexpected" behaviour. If I have the yellow textHELLOand I add justWORLDto it, I would expect it to be yellow as well, unless I explicitly orderedWORLDto have another colour. Others might expect something else. - Having Paragraph adapted to the new format (and thus working with PangoMarkup) would allow for easy typesetting of code, because we are already using Pygments and Pygments has a PangoMarkup formatter.
I see that there is a huge risk of "unexpected" behaviour.
What would be the better approach here to avoid confusion?
I do not have one. I even doubt there is one, because one might expect several things and all of them could be reasonable. One thing I can see is that we might distinguish between attributes that have been explicitly set (those should persist) and attributes that have their default value (they should be superseded by the existing value):
-
FormattedString("hello", color=YELLOW) + FormattedString("world")should be all in yellow, because no color is set for the second part -
FormattedString("hello", color=YELLOW) + FormattedString("world", color=WHITE)should havehelloin yellow andworldin white.
However, one might argue that this is not to be expected either, because FormattedString("world") is white by default and it should maybe remain white even if added to some other text.
Some personal opinions on the new implementation of Text:
- Due to the existance of ligatures, the result of slicing mobjects may be not expected (e.g.
Text("flame")[:2]may contain 3 characters). So let's just give up slicing through text string itself, let alone markup tags that may be contained. - I'd suggest to introduce in a third-party package, svgelements, for svg parsing --- in ManimGL I've committed a PR to do so, and the code becomes much cleaner. Although this parser still has something not supported, it provides more than current parsing. However, I'm not sure if CE would like to include another third-party package.
- The user should be able to control where a long string should wrap with
\ncharacter. Currently in both ManimGL and ManimCE a long text will be automatically wrapped, due to thewidthparameter passed to cairo surface (See the following picture, using ManimGL. In ManimCE, the font is scaled so that a line can hold more words, but to a certain extent it also wraps). The side effect brought by this parameter should be eliminated. - I'm not in favor of the syntax
FormattedString("hello") + " world", which appears confusing to me. The+operator inFormattedString("hello") + FormattedString("world")can just inherit fromVMobject + VMobject.
intro_words = Text("""
The original motivation for manim was to
better illustrate mathematical functions
as transformations.
""")
self.add(intro_words)

I'd suggest to introduce in a third-party package, svgelements, for svg parsing --- in ManimGL I've committed a https://github.com/3b1b/manim/pull/1731 to do so, and the code becomes much cleaner. Although this parser still has something not supported, it provides more than current parsing. However, I'm not sure if CE would like to include another third-party package.
Yes. We should not reinvent the wheel with SVG parsing given how complex it is. Replacing the current parser with a third-party library would without a doubt be healthy for the library! svgelements in particular has also been mentioned in the related issue, #1040.
So let's just give up slicing through text string itself, let alone markup tags that may be contained.
Yeah, I plan to drop support for slicing, instead, users can use the Markup or use FormattedString as proposed here.
The user should be able to control where a long string should wrap with \n character. Currently in both ManimGL and ManimCE a long text will be automatically wrapped, due to the width parameter passed to cairo surface (See the following picture, using ManimGL. In ManimCE, the font is scaled so that a line can hold more words, but to a certain extent it also wraps). The side effect brought by this parameter should be eliminated.
I think it would be better to wrap the lines if it exceeds the screen size, rather than a fixed width as done currently. Also, there would be a parameter to not automatically wrap lines in FormattedString so that users can control it by adding new lines manually.
I'm not in favor of the syntax FormattedString("hello") + " world", which appears confusing to me. The + operator in FormattedString("hello") + FormattedString("world") can just inherit from VMobject + VMobject.
I don't think FormattedString will be a Mobject since it doesn't render anything, but instead, it can be combined in Text to render. Also, are VMobject are addable (ie is VMobject + VMobject even possible?)?
I wanted FormattedString to inherit from str since it's literally storing a string but with extra data and that would mean it should support adding str as well. I think if it creates a lot of confusion we can avoid doing it? I would like to hear more thoughts about that.
Oh, I regarded FormattedString as a child class of Mobject. Sorry for misleading. In that case it becomes reasonable for me.
Also, are VMobject are addable (ie is VMobject + VMobject even possible?)?
VMobject.__add__(self, vmobject) is implemented which returns a new VMobject. See code here. But never mind then...
Another idea for this Text refactor: in regards to fixing AddTextWordByWord, maybe indexing a Text mobject should return each word separately?
maybe indexing a Text mobject should return each word separately?
I think it would break the Write animation then.
Just to post some updates here, I have done some work on ManimPango; see https://github.com/ManimCommunity/ManimPango/milestone/1. It is a huge refactor, would take a lot of time and is still a WIP.