python-statemachine
Building documentation fails in some directory configurations
I followed the steps described in the "Get Started" section of the contribution page. Step 9 (`uv run sphinx-build docs docs/_build/html`) raises an error:
```
Configuration error:
There is a programmable error in your configuration file:
Traceback (most recent call last):
  File "C:\Projects\python-statemachine\.venv\Lib\site-packages\sphinx\config.py", line 530, in eval_config_file
    exec(code, namespace) # NoQA: S102
    ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Projects\python-statemachine\docs\conf.py", line 287, in <module>
    "image_scrapers": (MachineScraper(project_root),),
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Projects\python-statemachine\tests\scrape_images.py", line 17, in __init__
    self.re_machine_module_name = re.compile(r"C:\Projects\python-statemachine/(.*).py$")
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\re\__init__.py", line 228, in compile
    return _compile(pattern, flags)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\re\__init__.py", line 307, in _compile
    p = _compiler.compile(pattern, flags)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\re\_compiler.py", line 745, in compile
    p = _parser.parse(p, flags)
        ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\re\_parser.py", line 979, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "C:\Python312\Lib\re\_parser.py", line 460, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\re\_parser.py", line 544, in _parse
    code = _escape(source, this, state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\re\_parser.py", line 443, in _escape
    raise source.error("bad escape %s" % escape, len(escape))
re.error: bad escape \P at position 2
```
I have traced this down to `MachineScraper`, which compiles a regular expression using `re.compile(f"{self.project_root}/(.*).py$")`. The issue appears to be that on Windows the path separator is `\` rather than `/`, so when the project root is interpolated into the pattern, each backslash and the character after it are parsed as a regex escape sequence, whether or not that sequence is valid.
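For anyone who wants to reproduce this without running Sphinx, here is a minimal sketch using only the standard library. The helper name `try_compile` is hypothetical and not part of python-statemachine, and the POSIX-style path is an assumed example; the Windows path is the one from the traceback above.

```python
import re


def try_compile(project_root):
    """Build the pattern the same way as the failing line and report any error.

    Hypothetical helper for illustration only: returns the error message
    if the pattern is rejected, or None if it compiles.
    """
    try:
        re.compile(f"{project_root}/(.*).py$")
    except re.error as exc:
        return str(exc)
    return None


# Windows-style root: "\P" is parsed as a regex escape and rejected.
print(try_compile(r"C:\Projects\python-statemachine"))

# POSIX-style root (assumed example path): no backslashes, so it compiles.
print(try_compile("/home/user/python-statemachine"))
```

On a POSIX-style root the pattern compiles, which would explain why the build only fails in some directory configurations.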
Thanks a lot for reporting this, @OliverDavey. You're absolutely right: the issue stems from how backslashes in Windows paths are interpreted in regex patterns (e.g., `\P` is an invalid escape sequence).
To fix it, we can escape `project_root` with `re.escape()` before embedding it into the regex, like so:
escaped_root = re.escape(os.path.abspath(project_root))
self.re_machine_module_name = re.compile(rf"{escaped_root}[\\/](.*)\.py$")
This makes the regex compatible across platforms, allowing both / and \ as path separators.
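As a quick sanity check of that idea, here is a standalone sketch. The function name `build_module_pattern` is hypothetical, the POSIX root is an assumed example path, and `os.path.abspath` is left out on purpose so the snippet behaves the same on any operating system; it is not the actual `MachineScraper` code.

```python
import re


def build_module_pattern(project_root: str) -> re.Pattern:
    """Compile a pattern that captures the module path below project_root.

    Hypothetical helper: project_root is assumed to be absolute already.
    """
    # re.escape neutralises backslashes (and any other regex metacharacters).
    escaped_root = re.escape(project_root)
    # [\\/] accepts either separator; \. matches a literal dot before "py".
    return re.compile(rf"{escaped_root}[\\/](.*)\.py$")


win = build_module_pattern(r"C:\Projects\python-statemachine")
match = win.search(r"C:\Projects\python-statemachine\tests\scrape_images.py")
print(match.group(1))

posix = build_module_pattern("/home/user/python-statemachine")
match = posix.search("/home/user/python-statemachine/tests/scrape_images.py")
print(match.group(1))
```

The captured group keeps whatever separator the input path used, so any later conversion to a dotted module name would need to handle both `\` and `/`.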
Tagging @fgmacedo here! Happy to help put together a PR with this change (and maybe add a quick test for it as well) if that sounds good to you!
Thanks @OliverDavey for reporting this.
@rafaelrds I agree that using re.escape is a more robust solution. Please go ahead!
@fgmacedo I've amended my PR #527 to use re.escape as recommended above
Thanks @OliverDavey !