
update: Refactoring Code-class and new Custom pygments based formatter

Open OliverStrait opened this issue 1 year ago • 1 comments

Overview: What does this pull request change?

  • Refactoring Code class
  • Added new Formatter class CodeColorFormatter
  • Added tests and a new utility file for testing reading and formatting from a code file.

Motivation and Explanation: Why and how do your changes improve the library?

  • The old Code parser is convoluted and hard to follow: it is a custom Python parser with poor naming and procedures that jump around and overlap. It used pygments' HtmlFormatter as its backend and then parsed the resulting HTML into a list[tuple[str, str]] structure, which is used for coloring.
  • The new custom formatter skips HTML and transforms the code directly into that list mapping using pygments lexers and tokenizer functions.
  • Better naming, and responsibilities encapsulated in logical chunks.
  • The new system is roughly 2 to 3 times faster, but the Mobject constructors are still the performance bottleneck.
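
A minimal sketch of the "skip HTML" idea described above, assuming only the public pygments API (`get_lexer_by_name`, `get_style_by_name`, `Lexer.get_tokens`, `style_for_token`). The function name `color_map` and the `fallback` parameter are hypothetical and only illustrate the approach; this is not the `CodeColorFormatter` from this PR.

```python
from pygments.lexers import get_lexer_by_name
from pygments.styles import get_style_by_name


def color_map(code: str, language: str = "python",
              style_name: str = "default",
              fallback: str = "#000000") -> list[tuple[str, str]]:
    """Return (text, hex_color) pairs for `code` without producing HTML.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    lexer = get_lexer_by_name(language)
    style = get_style_by_name(style_name)
    pairs = []
    for token_type, text in lexer.get_tokens(code):
        # style_for_token returns a dict; "color" is a hex string
        # without the leading "#", or None if the style sets no color.
        color = style.style_for_token(token_type)["color"]
        pairs.append((text, f"#{color}" if color else fallback))
    return pairs


if __name__ == "__main__":
    for text, color in color_map("def f():\n    return 1\n"):
        print(repr(text), color)
```

Because the lexer yields tokens directly, there is no intermediate HTML string to parse back, which is presumably where part of the reported speedup comes from.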

[Image: code_class-new_perf]

Links to added or changed documentation pages

Further Information and Comments

  • If Paragraph supported a color-to-text mapping, we could remove the whole block of recoloring code that runs after object construction (a small performance boost).
  • Tested with pygments 2.18.0 and the Python language.
      • There may be some odd cases where the pygments tokenizer sneaks newlines into unexpected places. It would be good to run tests with different syntax and languages.
      • If uncaught newlines inside string literals are passed to Paragraph, they will break the recoloring and the line numbers.

Reviewer Checklist

  • [ ] The PR title is descriptive enough for the changelog, and the PR is labeled correctly
  • [ ] If applicable: newly added non-private functions and classes have a docstring including a short summary and a PARAMETERS section
  • [ ] If applicable: newly added functions and classes are tested

OliverStrait Aug 13 '24 17:08

  • I made the parser more general so that it handles odd cases where the tokenizer produces string literals with newlines hidden inside.
  • Somehow the Paragraph class cannot handle empty lines at the end of a string (the line numbers then misalign), so I added a parser step that removes them.
  • (Also fixed the messed-up git history.)
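
The two parser fixes above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: `split_into_lines` is a hypothetical name, and the input is assumed to be a flat list of (text, color) pairs in which any token may contain newlines.

```python
def split_into_lines(pairs):
    """Regroup (text, color) pairs into one list of pairs per source line.

    Hypothetical helper for illustration only.
    """
    lines = [[]]
    for text, color in pairs:
        # A single token (e.g. a triple-quoted string) may span several lines.
        parts = text.split("\n")
        for i, part in enumerate(parts):
            if i > 0:
                lines.append([])  # every "\n" starts a new line
            if part:
                lines[-1].append((part, color))
    # Drop empty lines at the end; interior blank lines are kept.
    while len(lines) > 1 and not lines[-1]:
        lines.pop()
    return lines


if __name__ == "__main__":
    demo = [("x", "#111111"), (" = ", "#222222"),
            ('"""a\nb"""', "#333333"), ("\n", "#222222"), ("\n", "#222222")]
    for number, line in enumerate(split_into_lines(demo), start=1):
        print(number, line)
```

Splitting every token on newlines keeps one entry per displayed line, so line numbers and per-line coloring stay aligned even when a string literal spans several lines.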

OliverStrait Aug 22 '24 15:08

I'll close this as we now have #4115

JasonGrace2282 Jan 19 '25 18:01