
Compile section matching regex in encoder.py for performance.

Open tirkarthi opened this issue 3 years ago • 0 comments

During encoding, the section-name regex is not precompiled: the pattern string is passed to re.match directly on every loop iteration. Compiling the pattern once and reusing the compiled object improves performance.
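As a minimal sketch of the idea (not the actual toml code), the two forms below give the same answer for every input. The function names `needs_quoting_uncompiled` and `needs_quoting_compiled` and the constant `SECTION_RE` are hypothetical, introduced only for illustration; the pattern itself is the one from encoder.py.

```python
import re

# Same pattern the encoder uses to detect bare (unquoted) section names.
# SECTION_RE is a hypothetical name used only in this sketch.
SECTION_RE = re.compile(r'^[A-Za-z0-9_-]+$')


def needs_quoting_uncompiled(section):
    # The pattern string is handed to re.match on every call, so the re
    # module has to resolve it to a compiled pattern each time.
    return not re.match(r'^[A-Za-z0-9_-]+$', section)


def needs_quoting_compiled(section):
    # The pattern was compiled once at import time and is reused here.
    return not SECTION_RE.match(section)


# Both variants agree on every sample, so the change is behavior-preserving.
samples = ["owner", "servers.alpha", "a b", "dash-ed_1", ""]
for name in samples:
    assert needs_quoting_uncompiled(name) == needs_quoting_compiled(name)
```

Only the per-call work of resolving the pattern string changes; the match results are identical.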

tmp.toml is a copy of https://github.com/uiri/toml/blob/master/examples/example-v0.4.0.toml

Benchmark numbers on Python 3.8 (uncompiled.json is from the current code, compiled.json is from the patched code):

$ python -m pyperf timeit -s 'from toml import loads, dumps; data = loads(open("tmp.toml").read())' 'dumps(data)' -o uncompiled.json
.....................
$ python -m pyperf timeit -s 'from toml import loads, dumps; data = loads(open("tmp.toml").read())' 'dumps(data)' -o compiled.json
.....................
$ python -m pyperf compare_to uncompiled.json compiled.json --table                                                               
+-----------+------------+----------------------+
| Benchmark | uncompiled | compiled             |
+===========+============+======================+
| timeit    | 457 us     | 376 us: 1.21x faster |
+-----------+------------+----------------------+

Patch to compile the pattern once in the class body:

diff --git a/toml/encoder.py b/toml/encoder.py
index bf17a72..1b74431 100644
--- a/toml/encoder.py
+++ b/toml/encoder.py
@@ -128,6 +128,8 @@ def _dump_time(v):
 
 class TomlEncoder(object):
 
+    section_pattern = re.compile(r'^[A-Za-z0-9_-]+$')
+
     def __init__(self, _dict=dict, preserve=False):
         self._dict = _dict
         self.preserve = preserve
@@ -185,10 +187,11 @@ class TomlEncoder(object):
             sup += '.'
         retdict = self._dict()
         arraystr = ""
+        section_pattern = self.section_pattern
         for section in o:
             section = unicode(section)
             qsection = section
-            if not re.match(r'^[A-Za-z0-9_-]+$', section):
+            if not section_pattern.match(section):
                 qsection = _dump_str(section)
             if not isinstance(o[section], dict):
                 arrayoftables = False
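
The shape of the patch can be tried in isolation with the sketch below. It is not the toml encoder: `SketchEncoder` and `quote_sections` are hypothetical names, and `json.dumps` is only a stand-in for toml's `_dump_str`, assumed here to produce a double-quoted string.

```python
import json
import re


class SketchEncoder(object):
    # Compiled once when the class body executes and shared by all instances.
    section_pattern = re.compile(r'^[A-Za-z0-9_-]+$')

    def quote_sections(self, names):
        # Bind the pattern to a local name so the loop does not repeat
        # the attribute lookup on self for every section.
        section_pattern = self.section_pattern
        quoted = []
        for section in names:
            qsection = section
            if not section_pattern.match(section):
                # json.dumps is a stand-in for toml's _dump_str (assumption).
                qsection = json.dumps(section)
            quoted.append(qsection)
        return quoted


print(SketchEncoder().quote_sections(["owner", "a b"]))
```

Keeping the pattern as a class attribute also leaves room for a subclass to override it, whereas a module-level constant would be fixed for every encoder.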

tirkarthi, Mar 15 '21 19:03