Migration notes (4.3.2 -> 4.4.0)

Open tylerjw opened this issue 3 years ago • 6 comments

The release of 4.4.0 broke our Sphinx build and I haven't been able to figure out how to fix it. This is the error I'm getting:

Exception occurred:
  File "/usr/lib/python3.8/sre_parse.py", line 671, in _parse
    raise source.error("multiple repeat",
re.error: multiple repeat at position 46

I googled for it and looked through this repo, but I was unable to find migration notes. Are those posted anywhere? Pinning the version of Sphinx I depend on to 4.3.2 avoids this error.

tylerjw avatar Jan 17 '22 18:01 tylerjw

Here is the full traceback:

Traceback (most recent call last):
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/cmd/build.py", line 284, in build_main
    app.build(args.force_all, filenames)
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/application.py", line 337, in build
    self.builder.build_update()
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/builders/__init__.py", line 294, in build_update
    self.build(to_build,
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/builders/__init__.py", line 358, in build
    self.write(docnames, list(updated_docnames), method)
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/builders/__init__.py", line 532, in write
    self._write_serial(sorted(docnames))
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/builders/__init__.py", line 539, in _write_serial
    doctree = self.env.get_and_resolve_doctree(docname, self)
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/environment/__init__.py", line 535, in get_and_resolve_doctree
    self.apply_post_transforms(doctree, docname)
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/environment/__init__.py", line 581, in apply_post_transforms
    transformer.apply_transforms()
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/transforms/__init__.py", line 87, in apply_transforms
    super().apply_transforms()
  File "/usr/lib/python3/dist-packages/docutils/transforms/__init__.py", line 171, in apply_transforms
    transform.apply(**kwargs)
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/transforms/post_transforms/__init__.py", line 43, in apply
    self.run(**kwargs)
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/ext/extlinks.py", line 59, in run
    self.check_uri(refnode)
  File "/home/tyler/.local/lib/python3.8/site-packages/sphinx/ext/extlinks.py", line 72, in check_uri
    uri_pattern = re.compile(base_uri.replace('%s', '(?P<value>.+)'))
  File "/usr/lib/python3.8/re.py", line 252, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python3.8/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.8/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.8/sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib/python3.8/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib/python3.8/sre_parse.py", line 671, in _parse
    raise source.error("multiple repeat",
re.error: multiple repeat at position 46

tylerjw avatar Jan 17 '22 19:01 tylerjw

That line was introduced in https://github.com/sphinx-doc/sphinx/commit/8356260554fee9ebd26a2c11cdf039af36cd951e, part of #9800. My guess is that base_uri has some unescaped characters?

I can reproduce the error like this:

>>> import re
>>> base_uri = "https://example.com/read-c++-tutorial/"  # notice double +
>>> uri_pattern = re.compile(base_uri.replace('%s', '(?P<value>.+)'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/re.py", line 252, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python3.8/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.8/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.8/sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib/python3.8/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib/python3.8/sre_parse.py", line 671, in _parse
    raise source.error("multiple repeat",
re.error: multiple repeat at position 27
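
A minimal sketch of one way to avoid the error, assuming the literal parts of base_uri are meant to match verbatim: split on %s, pass each piece through re.escape, and join the pieces with the capture group. The helper name build_uri_pattern and the example.com URL are hypothetical, and this is not necessarily how Sphinx itself handles it.

```python
import re

VALUE_GROUP = '(?P<value>.+)'


def build_uri_pattern(base_uri):
    """Compile base_uri into a regex, treating everything except %s literally.

    Hypothetical helper for illustration; not part of Sphinx.
    """
    literal_parts = base_uri.split('%s')
    # Escape each literal piece so characters such as '+' and '.' lose
    # their regex meaning, then put the capture group where %s was.
    return re.compile(VALUE_GROUP.join(re.escape(p) for p in literal_parts))


base_uri = "https://example.com/read-c++-tutorial/%s"  # hypothetical URL
pattern = build_uri_pattern(base_uri)

match = pattern.match("https://example.com/read-c++-tutorial/intro")
print(match.group('value'))  # prints "intro"
```

The traceback above is from Python 3.8. As far as I know, Python 3.11 and later accept `++` as a possessive quantifier, so there the unescaped pattern may compile but would still not match a literal `c++`.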

@tylerjw Could you add some print statements or something that allow us to understand what the base_uri is right before the crash?

astrojuanlu avatar Jan 18 '22 17:01 astrojuanlu

After adding print statements, I found that this is the base_uri causing the problem:

http://docs.ros.org/noetic/api/tf2_ros/html/c++/classtf2__ros_1_1%s.html

Thank you for your help on this. I should have known to do this debugging step. We have this extlinks entry:

"tf2": (
    "http://docs.ros.org/"
    + ros1_distro
    + "/api/tf2_ros/html/c++/classtf2__ros_1_1%s.html",
    "",
),

Shouldn't the %s be the only special sequence, with the rest of the string treated as a raw string and not a regular expression? My understanding is that the + character has meaning in regular expressions and isn't being escaped properly on this line.

tylerjw avatar Jan 18 '22 18:01 tylerjw

Do I need to use URL escape characters for the + or regular expression escape characters? Is there a pythonic way of doing this so I can have all my links escaped correctly for me?

tylerjw avatar Jan 18 '22 18:01 tylerjw
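
A rough sketch comparing the two kinds of escaping on the conf.py side, assuming the base URL is later expanded with %-formatting and that ros1_distro is "noetic". The class name Buffer is a placeholder, and whether docs.ros.org serves the percent-encoded path has not been checked here.

```python
import re
from urllib.parse import quote

# base_uri reported above, with ros1_distro assumed to be "noetic".
raw = "http://docs.ros.org/noetic/api/tf2_ros/html/c++/classtf2__ros_1_1%s.html"

# Option A: regex-escape '+' in conf.py. The pattern would compile, but the
# backslashes also end up in the generated link.
regex_escaped = raw.replace('+', r'\+')
link_a = regex_escaped % 'Buffer'  # 'Buffer' is a hypothetical class name

# Option B: percent-encode '+' as %2B. The '%' is doubled because the
# string is later used as a %-format template.
url_escaped = raw.replace('+', quote('+', safe='').replace('%', '%%'))
link_b = url_escaped % 'Buffer'

# Same substitution as the extlinks.py line in the traceback.
pattern_b = re.compile(url_escaped.replace('%s', '(?P<value>.+)'))

print(link_a)
print(link_b)
print(pattern_b.pattern)
```

Regex escaping in conf.py changes the link itself, so it does not look like a usable workaround. Percent-encoding keeps the link a valid URL and compiles as a pattern, although the doubled % stays in the pattern, so it would not match the expanded link.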

Was this solved?

felixvd avatar Aug 02 '22 07:08 felixvd

@felixvd I never figured this out. As you can see in our requirements.txt, we are still using v4.3.2, and the string I posted above is still in our conf.py.

tylerjw avatar Aug 02 '22 13:08 tylerjw