sphinx-immaterial icon indicating copy to clipboard operation
sphinx-immaterial copied to clipboard

Merge in upstream changes from mkdocs-material

Open jbms opened this issue 1 year ago • 31 comments

This replaces the modified Sphinx search implementation with the lunr.js-based search implementation from the upstream mkdocs-material theme.

Maintaining the existing custom search was proving to be untenable. Using the upstream search implementation greatly reduces the size of the diff and should make it much easier to incorporate future changes.

The downsides are:

  • Search index is significantly bigger. However, individual pages no longer need to be fetched to display detailed search result info; consequently, the net effect may not be that large.
  • The Sphinx search implementation had special support for sphinx object descriptions which was especially helpful for API documentation.

Remaining work before merging this:

  • [x] Update documentation to describe new features/changes
  • [ ] Fix search index for incremental builds
  • Better integration of Sphinx "objects" into the lunr.js-based search:
    • [ ] typing the name of an API symbol often does not give that API symbol as the first result; we will need to fix that.
    • [ ] search results for Sphinx objects displayed specially (e.g. tagged with "Python function") in the old search implementation. We should add something similar to the new search implementation.

jbms avatar Apr 11 '24 04:04 jbms

nox > mypy 
sphinx_immaterial/search.py:89: error: Unsupported left operand type for /
("str")  [operator]
        output_path = app.outdir / "search" / "search_index.json"
                      ^~~~~~~~~~~~~~~~~~~~~

mypy on py v3.8 uses sphinx<7, so app.outdir is a str where it was changed to a pathlib.Path in v7

Or we could drop v3.8 support like sphinx did in v7


I'm having trouble with npm run build. I'm getting a "Illegal characters in path" error from rimraf:

npm run build    

> [email protected] build
> npm run clean && ts-node -T tools/build --optimize


> [email protected] clean
> rimraf sphinx_immaterial/{*.html,partials,.icons,LICENSE,bundles}

Error: Illegal characters in path.
    at pathArg (file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/path-arg.js:40:33)
    at file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/index.js:17:54
    at Array.map (<anonymous>)
    at file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/index.js:17:42
    at main (file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/bin.mjs:243:15)
    at file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/bin.mjs:251:5 {
  path: 'C:\\dev\\sphinx-immaterial\\sphinx_immaterial\\{*.html,partials,.icons,LICENSE,bundles}',
  code: 'EINVAL'
}

This is likely exclusive to Windows. Also I can't just switch to WSL Ubuntu because most tools run by npx or installed via npm still see WSL as Windows (accurately even if annoyingly so).

2bndy5 avatar Apr 11 '24 05:04 2bndy5

nox > mypy 
sphinx_immaterial/search.py:89: error: Unsupported left operand type for /
("str")  [operator]
        output_path = app.outdir / "search" / "search_index.json"
                      ^~~~~~~~~~~~~~~~~~~~~

mypy on py v3.8 uses sphinx<7, so app.outdir is a str where it was changed to a pathlib.Path in v7

Fixed.

Or we could drop v3.8 support like sphinx did in v7

I'm having trouble with npm run build. I'm getting a "Illegal characters in path" error from rimraf:

npm run build    

> [email protected] build
> npm run clean && ts-node -T tools/build --optimize


> [email protected] clean
> rimraf sphinx_immaterial/{*.html,partials,.icons,LICENSE,bundles}

Error: Illegal characters in path.
    at pathArg (file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/path-arg.js:40:33)
    at file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/index.js:17:54
    at Array.map (<anonymous>)
    at file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/index.js:17:42
    at main (file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/bin.mjs:243:15)
    at file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/bin.mjs:251:5 {
  path: 'C:\\dev\\sphinx-immaterial\\sphinx_immaterial\\{*.html,partials,.icons,LICENSE,bundles}',
  code: 'EINVAL'
}

This is likely exclusive to Windows. Also I can't just switch to WSL Ubuntu because most tools run by npx or installed via npm still see WSL as Windows (accurately even if annoyingly so).

I'm not sure what the cause of this is. The rimraf command didn't change although the rimraf version changed.

jbms avatar Apr 11 '24 06:04 jbms

I'm having trouble with npm run build. I'm getting a "Illegal characters in path" error from rimraf:

npm run build    

> [email protected] build
> npm run clean && ts-node -T tools/build --optimize


> [email protected] clean
> rimraf sphinx_immaterial/{*.html,partials,.icons,LICENSE,bundles}

Error: Illegal characters in path.
    at pathArg (file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/path-arg.js:40:33)
    at file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/index.js:17:54
    at Array.map (<anonymous>)
    at file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/index.js:17:42
    at main (file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/bin.mjs:243:15)
    at file:///C:/dev/sphinx-immaterial/node_modules/rimraf/dist/esm/bin.mjs:251:5 {
  path: 'C:\\dev\\sphinx-immaterial\\sphinx_immaterial\\{*.html,partials,.icons,LICENSE,bundles}',
  code: 'EINVAL'
}

This is likely exclusive to Windows. Also I can't just switch to WSL Ubuntu because most tools run by npx or installed via npm still see WSL as Windows (accurately even if annoyingly so).

I'm not sure what the cause of this is. The rimraf command didn't change although the rimraf version changed.

I think this is fixed now by adding --glob to rimraf.

jbms avatar Apr 11 '24 06:04 jbms

I think this is fixed now by adding --glob to rimraf.

yep confirmed. thanks!


I recently upgraded my primary system drive... So now I'm on node v21.5.0.

2bndy5 avatar Apr 11 '24 06:04 2bndy5

sphinx.errors.ExtensionError: Could not import extension sphinx_immaterial.search (exception: No module named 'sphinx_immaterial.mkdocs_compat.config')

RTD build showing this too. I think the setup.py needs tweaking for the "new" plugin from upstream. Maybe a recursive glob in here:

https://github.com/jbms/sphinx-immaterial/blob/076acb17e71e2ef0ee324284a85feb7c82be67f7/setup.py#L182

2bndy5 avatar Apr 11 '24 06:04 2bndy5

I think the setup.py needs tweaking

Nope. sphinx_immaterial/mkdocs_compat/config has no __init__.py

2bndy5 avatar Apr 11 '24 06:04 2bndy5

I hope you don't mind if I push some commits to your branch. I won't force push anything, so you keep squashing the history as you see fit.

2bndy5 avatar Apr 11 '24 06:04 2bndy5

I hope you don't mind if I push some commits to your branch. I won't force push anything, so you keep squashing the history as you see fit.

Yes please go ahead!

jbms avatar Apr 11 '24 07:04 jbms

I won't force push anything

I lied. I screwed up the typing on a commit I didn't run through mypy first...

2bndy5 avatar Apr 11 '24 07:04 2bndy5

I'm adding to the docs about new options for customizable icons (html_theme_options["icons"]["***"]). But I get this error from the new search.py extension when I add the line

.. themeconf:: search

the error complains

Handler <function _build_finished at 0x000001F72220AA20> for event 'build-finished' threw an exception (exception: 'CustomHTMLBuilder' object has no attribute 'globalcontext')

about this line https://github.com/jbms/sphinx-immaterial/blob/f8d5df34e8676be59a0480e9bfbceaefca84b99a/sphinx_immaterial/search.py#L54

full traceback

Traceback (most recent call last):
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\application.py", line 355, in build
    self.builder.build_update()
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\builders\__init__.py", line 293, in build_update
    self.build(to_build,
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\builders\__init__.py", line 313, in build
    updated_docnames = set(self.read())
                           ^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\builders\__init__.py", line 420, in read
    self._read_serial(docnames)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\builders\__init__.py", line 441, in _read_serial
    self.read_doc(docname)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\builders\__init__.py", line 498, in read_doc
    publisher.publish()
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\core.py", line 234, in publish
    self.document = self.reader.read(self.source, self.parser,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\io.py", line 105, in read
    self.parse()
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\readers\__init__.py", line 76, in parse
    self.parser.parse(self.input, document)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\parsers.py", line 81, in parse
    self.statemachine.run(inputlines, document, inliner=self.inliner)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 169, in run
    results = StateMachineWS.run(self, input_lines, input_offset,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 233, in run
    context, next_state, result = self.check_line(
                                  ^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 445, in check_line
    return method(match, context, next_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 3024, in text
    self.section(title.lstrip(), source, style, lineno + 1, messages)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 325, in section
    self.new_subsection(title, lineno, messages)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 391, in new_subsection        
    newabsoffset = self.nested_parse(
                   ^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 279, in nested_parse
    state_machine.run(block, input_offset, memo=self.memo,
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 195, in run
    results = StateMachineWS.run(self, input_lines, input_offset)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 233, in run
    context, next_state, result = self.check_line(
                                  ^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 445, in check_line
    return method(match, context, next_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2785, in underline
    self.section(title, source, style, lineno - 1, messages)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 325, in section
    self.new_subsection(title, lineno, messages)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 391, in new_subsection        
    newabsoffset = self.nested_parse(
                   ^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 279, in nested_parse
    state_machine.run(block, input_offset, memo=self.memo,
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 195, in run
    results = StateMachineWS.run(self, input_lines, input_offset)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 233, in run
    context, next_state, result = self.check_line(
                                  ^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 445, in check_line
    return method(match, context, next_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2357, in explicit_markup      
    self.explicit_list(blank_finish)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2382, in explicit_list        
    newline_offset, blank_finish = self.nested_list_parse(
                                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 316, in nested_list_parse     
    state_machine.run(block, input_offset, memo=self.memo,
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 195, in run
    results = StateMachineWS.run(self, input_lines, input_offset)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 233, in run
    context, next_state, result = self.check_line(
                                  ^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 445, in check_line
    return method(match, context, next_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2660, in explicit_markup      
    nodelist, blank_finish = self.explicit_construct(match)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2367, in explicit_construct   
    return method(self, expmatch)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2104, in directive
    return self.run_directive(
           ^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2154, in run_directive        
    result = directive_instance.run()
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\domains\__init__.py", line 289, in run
    return super().run()
           ^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\apidoc\object_toc.py", line 262, in run
    nodes = orig_run(self)
            ^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\apidoc\object_toc.py", line 206, in run
    nodes = orig_run(self)
            ^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\directives\__init__.py", line 285, in run
    nested_parse_with_titles(self.state, self.content, contentnode, self.content_offset)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\util\nodes.py", line 333, in nested_parse_with_titles        
    return state.nested_parse(content, content_offset, node, match_titles=1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 279, in nested_parse
    state_machine.run(block, input_offset, memo=self.memo,
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 195, in run
    results = StateMachineWS.run(self, input_lines, input_offset)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 233, in run
    context, next_state, result = self.check_line(
                                  ^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 445, in check_line
    return method(match, context, next_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2357, in explicit_markup      
    self.explicit_list(blank_finish)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2382, in explicit_list        
    newline_offset, blank_finish = self.nested_list_parse(
                                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 316, in nested_list_parse     
    state_machine.run(block, input_offset, memo=self.memo,
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 195, in run
    results = StateMachineWS.run(self, input_lines, input_offset)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 233, in run
    context, next_state, result = self.check_line(
                                  ^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 445, in check_line
    return method(match, context, next_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2660, in explicit_markup      
    nodelist, blank_finish = self.explicit_construct(match)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2367, in explicit_construct   
    return method(self, expmatch)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2104, in directive
    return self.run_directive(
           ^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2154, in run_directive        
    result = directive_instance.run()
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\domains\__init__.py", line 289, in run
    return super().run()
           ^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\apidoc\object_toc.py", line 262, in run
    nodes = orig_run(self)
            ^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\apidoc\object_toc.py", line 206, in run
    nodes = orig_run(self)
            ^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\directives\__init__.py", line 285, in run
    nested_parse_with_titles(self.state, self.content, contentnode, self.content_offset)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\util\nodes.py", line 333, in nested_parse_with_titles        
    return state.nested_parse(content, content_offset, node, match_titles=1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 279, in nested_parse
    state_machine.run(block, input_offset, memo=self.memo,
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 195, in run
    results = StateMachineWS.run(self, input_lines, input_offset)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 233, in run
    context, next_state, result = self.check_line(
                                  ^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 445, in check_line
    return method(match, context, next_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 1170, in indent
    elements = self.block_quote(indented, line_offset)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 1187, in block_quote
    self.nested_parse(blockquote_lines, line_offset, blockquote)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 279, in nested_parse
    state_machine.run(block, input_offset, memo=self.memo,
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 195, in run
    results = StateMachineWS.run(self, input_lines, input_offset)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 233, in run
    context, next_state, result = self.check_line(
                                  ^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\statemachine.py", line 445, in check_line
    return method(match, context, next_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 2716, in blank
    paragraph, literalnext = self.paragraph(
                             ^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 416, in paragraph
    textnodes, messages = self.inline_text(text, lineno)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 425, in inline_text
    nodes, messages = self.inliner.parse(text, lineno,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 649, in parse
    before, inlines, remaining, sysmessages = method(self, match,
                                              ^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 792, in interpreted_or_phrase_ref
    nodelist, messages = self.interpreted(rawsource, escaped, role,
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\docutils\parsers\rst\states.py", line 889, in interpreted
    nodes, messages2 = role_fn(role, rawsource, text, lineno, self)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\util\docutils.py", line 487, in __call__
    return self.run()
           ^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\inline_icons.py", line 70, in run
    load_svg_into_builder_env(
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\inline_icons.py", line 42, in load_svg_into_builder_env
    raise FileNotFoundError(
FileNotFoundError: material/search.svg not found in sphinx_immaterial_icon_path and not bundled with the theme

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\events.py", line 97, in emit
    results.append(listener.handler(self.app, *args))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\search.py", line 94, in _build_finished
    indexer = _make_indexer(app)
              ^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\search.py", line 63, in _make_indexer
    return search_plugin.SearchIndex(**dict(_get_search_config(app)))
                                            ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\search.py", line 31, in _get_search_config
    plugin.on_config(_ThemeConfig(app))
                     ^^^^^^^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\search.py", line 38, in __init__
    self.theme = _Theme(app)
                 ^^^^^^^^^^^
  File "C:\dev\sphinx-immaterial\sphinx_immaterial\search.py", line 54, in __init__
    self._jinja2_env.globals.update(builder.globalcontext)
                                    ^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'CustomHTMLBuilder' object has no attribute 'globalcontext'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\cmd\build.py", line 298, in build_main
    app.build(args.force_all, args.filenames)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\application.py", line 363, in build
    self.events.emit('build-finished', err)
  File "C:\dev\sphinx-immaterial\.env\Lib\site-packages\sphinx\events.py", line 108, in emit
    raise ExtensionError(__("Handler %r for event %r threw an exception") %
sphinx.errors.ExtensionError: Handler <function _build_finished at 0x000001F72220AA20> for event 'build-finished' threw an exception (exception: 'CustomHTMLBuilder' object has no attribute 'globalcontext')

2bndy5 avatar Apr 11 '24 09:04 2bndy5

I noticed you fixed some lint issues in the search plugin (which I copied unmodified from upstream except for replacing mkdocs with mkdocs_compat) --- unless they are necessary bug fixes it would be better to revert those to make it easier to merge changes in the future.

Similarly we can just check for the HTML builder in our own search extension sphinx_immaterial/search.py rather than modifying the upstream plugin.

jbms avatar Apr 11 '24 16:04 jbms

The icon-related error was actually due to material/search.svg not being a valid icon path. The extra error was due to my search integration not aborting if an error has already occurred. That is now fixed.

jbms avatar Apr 11 '24 16:04 jbms

I noticed you fixed some lint issues in the search plugin (which I copied unmodified from upstream except for replacing mkdocs with mkdocs_compat)

I didn't notice it was a copied python src (not used to looking out for that I guess). I see you reverted the lint changes 👍🏼 . ~We should add exclusion rules for python tooling (like black, mypy, and pylint) to ignore those copied sources.~ I see you already added exclusion rules.

BTW, those linting changes came from advice from ruff (I no longer use pylint as my daily linter).


I cleaned up a couple of my commits in the branch history.

2bndy5 avatar Apr 11 '24 20:04 2bndy5

Locally, I've been playing with nav_adapt.py to experiment with page icons in the global ToC. It works if I hard-code the icon like so:

    def get_result(self) -> MkdocsNavEntry:
        return MkdocsNavEntry(
            title_text=cast(str, self._rendered_title_text),
            url=self._url,
            children=self._children,
            active=False,
            current=False,
            caption_only=False,
            meta={"icon": "material/emoticon-happy"},
            is_page=True,
        )

Where the is_page is only used in the nav-item.html https://github.com/jbms/sphinx-immaterial/blob/e439e1c78264ed0daf2299413a6d59b97242be8a/src/templates/partials/nav-item.html#L44-L45 https://github.com/jbms/sphinx-immaterial/blob/e439e1c78264ed0daf2299413a6d59b97242be8a/src/templates/partials/nav-item.html#L59-L60

But I'm having trouble catching the metadata (field or directive) during the _TocVisitor process.

2bndy5 avatar Apr 11 '24 20:04 2bndy5

The _TocVisitor only has access to what is extracted from sphinx.environment.adapters.toctree.TocTree. I think you can access all of the doctrees from the builder, though, and from there extract the metadata for a given page.

The mkdocs-material theme embeds each svg inline. Potentially we could instead leverage our inline_icon mechanism to avoid duplicating them. Then we could insert them via the icon_html field that I added (needed to support the object description icons)

jbms avatar Apr 11 '24 21:04 jbms

It would be nice if we could support both, but I understand the bloated HTML concern with upstream's approach. Maybe we should just continue tracking this feature in #56 and table it for now.

IIRC, I had a branch that attempted to do the :si-icon: approach, but the ToC captions were an obstacle (aside from the astext() + _inject_wbr() usage).

2bndy5 avatar Apr 11 '24 23:04 2bndy5

To be clear we could support specifying the icon in the metadata, and still use our inline icon implementation for it.

jbms avatar Apr 11 '24 23:04 jbms

I'm seeing a console warning from this line https://github.com/jbms/sphinx-immaterial/blob/a2213137cfbab3d6f6490f9ab18e8312c504a50e/src/templates/partials/javascripts/content.html#L43 on each page load:

Empty string passed to getElementById().

This is despite weather or not the page contains content tabs.

Looking at upstream's equivalent source I think this problematic code was removed/replaced 2 years ago. It doesn't currently seem to have an affect on expected behavior. My guess is that this is a incorrect result of the merge script.

2bndy5 avatar Apr 12 '24 08:04 2bndy5

Mermaid rendering is broken: https://sphinx-immaterial--338.org.readthedocs.build/en/338/mermaid_diagrams.html#using-flowcharts

I'm not sure what's going on there. The mermaid dist is still copied to outdir/_static/mermaid/ as it should be. https://github.com/jbms/sphinx-immaterial/blob/8bfa816f21f28e899b894becc78d96b1cfccdc70/src/templates/assets/javascripts/components/content/mermaid/index.ts#L78 Maybe config.base is malformed or unset now that the search imlpementation was overhauled?

2bndy5 avatar Apr 12 '24 10:04 2bndy5

I bumped mermaid to 10.7.0 (as is used in upstream). this fixed the mermaid problem and resolves #328.

I also ran npm audit fix which bumped a couple package versions in the lock file (both had tar in the name). I hope that's ok. I can revert the audit if that was unwanted.

2bndy5 avatar Apr 12 '24 11:04 2bndy5

Bumping dependencies is fine in general, the one tricky thing is that we have to be careful about cssnano/postcss-related deps because newer versions result in OOM (also applies to upstream theme) and I don't yet know of a solution. The only way I got this merge to work was to copy the lockfile from the upstream theme and then run npm install to integrate our additional deps.

jbms avatar Apr 12 '24 17:04 jbms

A few other things I've noticed (all of which do not bother me):

  • Enhanced tooltips are not applied to graphviz diagrams. This likely isn't a simple CSS fix that could be added to our _graphviz.scss because the placement of enhanced tooltips seems to be calculated by JS.

  • The hlist directive throws enhanced tooltip placement off terribly.

  • The newer navigation.prune feature doesn't seem to have an effect. Although, I thought we were already doing this by default.

  • The toc object icons are aligned to the top of multi-lined entries, whereas before they were vertically center-aligned in multi-line entries

    before after
    image image

2bndy5 avatar Apr 13 '24 08:04 2bndy5

Honestly, this works well enough for me as it is now.

I'm going to focus on the feat: upstream issues though.

I haven't fully wrapped my head around the ported search implementation. My main confusion comes from unfamiliarity with lunr.js (or any search implementation for that matter). But I can try to help if this PR gets too stale.

FWIW, I really don't use the search functionality on web sites unless I'm in a hurry or completely lost for more than 3-5 minutes. I typically find the right page and use Ctrl+F instead.

I think the usefulness of search may depend on a lot of factors (including how good the search results are!) but for API documentation in particular a common use case is that you want to find the documentation for a specific symbol, and for that search is very useful. Suppose I want to find out about the tensorstore_demo.IndexDomain class. Compare our old search results:

https://jbms.github.io/sphinx-immaterial/?q=IndexDomain

with the new search results:

https://sphinx-immaterial--338.org.readthedocs.build/en/338/?q=IndexDomain

As for how search works, basically:

  1. the upstream search plugin parses the HTML of each page to split it into sections.
  2. Each section is then treated as a separate "document" for the purpose of searching. All HTML tags except a few that are supported for the "rich search results" are stripped out. Each document has a number of fields in addition to the main text/html content: the currently-used fields are title and tags, I believe. (Tags refers to the upstream tag feature that we don't currently support.) Documents can also have an additional numeric "boost" metadata property that affects how it is ranked. Each section with its fields is then serialized directly as JSON without further processing.
  3. The client javascript reads the json containing all of the sections and builds a lunr.js search index. When doing a search, the results are ranked based on which field the terms matched against --- tags are the highest, then title, then main body. Then the additional boost property is also factored into the rank.
  4. After performing the search, multiple documents from the same page are then grouped together when displaying results.

To improve results I think we should:

  • Modify the section parsing to ensure sphinx object descriptions are treated as separate sections
  • Include some metadata in the search documents generated from sphinx object descriptions to indicate the object description type (e.g. "Python class"), as we did before, and find a way to display it nicely (e.g. as text or as some type of icon) in the results. (Kind of optional, but pretty helpful.)
  • Sphinx has a concept of search weighting for object description types. We should include that as the boost value for the sections corresponding to object descriptions. That allows, for example, object descriptions for individual function parameters to be downweighted in the results.

It is possible we may need other changes to make API documentation searching work well, but those things would be a start.

jbms avatar Apr 13 '24 17:04 jbms

Superb explanation! 🚀 Doing another code dive...

2bndy5 avatar Apr 13 '24 20:04 2bndy5

Modify the section parsing to ensure sphinx object descriptions are treated as separate sections

Just a gut reaction here: To avoid any deviation from the inherited search plugin, we could instead derive from it and make any API changes that way. https://github.com/jbms/sphinx-immaterial/blob/89076eb96341f8242470a04694a5b36986d75b21/sphinx_immaterial/search.py#L57-L58

To compensate for this complexity, our search.py module could be organized as a sub-package. Also, support for other features (like tags) would then be contained/scoped to search-specific implementations.

2bndy5 avatar Apr 13 '24 20:04 2bndy5

Modify the section parsing to ensure sphinx object descriptions are treated as separate sections

Just a gut reaction here: To avoid any deviation from the inherited search plugin, we could instead derive from it and make any API changes that way.

https://github.com/jbms/sphinx-immaterial/blob/89076eb96341f8242470a04694a5b36986d75b21/sphinx_immaterial/search.py#L57-L58

To compensate for this complexity, our search.py module could be organized as a sub-package. Also, support for other features (like tags) would then be contained/scoped to search-specific implementations.

Yeah potentially we could indeed monkey patch some of the classes (e.g. SearchIndex and Parser) rather than modifying the file itself --- once we know what modifications are needed we can figure out the best way to make them.

Note that the upstream search plugin is already used in a hacky way from sphinx_immaterial/search.py since it is intended as a mkdocs plugin, not a sphinx extension. In particular the only portion of the SearchPlugin class itself that we are using is on_config for filling in default values of the config. Then we use the SearchIndex class directly.

jbms avatar Apr 13 '24 21:04 jbms

Still investigating an adequate approach, so this post is just an organization of my thoughts/findings.

Sphinx has a concept of search weighting for object description types. We should include that as the boost value for the sections corresponding to object descriptions

I found that Sphinx aggregates a prio in search.IndexBuilder.get_objects(). Assuming prio meabs "priority", this prio is returned by <derived-Domain()>.get_objects() (ie here for python & here for cpp). Luckily, this prio is return in a tuple that includes the object ID as used in the HTML elements' id attribute.

To suplement this info into our SearchIndex.add_entry_from_context() results, we actually need to augment upstream's SearchIndex.create_entry_for_section(). Each entry's "boost" field can be set according to the object's id; I'm thinking 1 + (0.5 * prio). However, fetching the object description's ID is not straight forward for our apigen outputs.

  • For autodoc (& manual invocation of sphinx domain directives), the id is appended to the entry's "location" field (as a URL hash location).
    [
      {
        "location": "demo_api.html#test-py-module",
        "title": "<code>test_py_module</code>",
        "text": "omitted for brevity"
      },
    ]
    
  • For apigen, the id may not present in the "location" field because each documented member gets its own page, resolving to just the page URL without hash URI. This case is a stumper (at least as I'm writing this).
    [
      {
        "location": "python_apigen_generated/IndexDomain.size-aea1f878.html",
        "title": "tensorstore_demo.IndexDomain.size",
        "text": "omitted for brevity"
      },
    ]
    

It would be nicer to have a 1-to-1 map of HTML-to-doctree to properly & robustly set the boost appropriately for API descriptions.

Initially, I don't think we need to monkeypatch anything to achieve this. It could probably be done in a separate function that _html_page_context() calls. But here's where we have to consider performance. If we can extract the CSS classes of the API description signature, then we can avoid traversing all domains in the builder env and, for example, just focus on py domain if "py" is in the list of classes.


I'm partial to opening another PR that merges into this branch, so we don't have to play around with this branch's git history to keep my proposal's reviews/changes "clean". The remaining goals blocking this PR are not simple objectives.

PS - I also thought about leveraging upstream's tag plugin with Sphinx domains, but I don't see that being very feasible since tags are really specific to an entire page (not sections of a page).

2bndy5 avatar Apr 14 '24 05:04 2bndy5

I think we can map back to the sphinx object description info from the HTML in a similar way to how the old search_adapt.py does a similar mapping:

https://github.com/jbms/sphinx-immaterial/blob/63a80bb381f1673e827939e8dc8f032b943bb80a/sphinx_immaterial/search_adapt.py#L24

We can first iterate over all objects to compute a map of (html_path, anchorname) -> (domain, object_type, any other info) and then use that when processing the HTML. The builder has a method to get the html path from the docname.

jbms avatar Apr 14 '24 05:04 jbms

Note that most object descriptions do not result in a section based on the current section parsing logic in the search plugin --- we would have to change the parsing, or modify the HTML before feeding it to the parser, to fix that. With apigen we do get a section because there is a separate page, but I think otherwise we don't.

jbms avatar Apr 14 '24 05:04 jbms

Just as a quick hack, I added "dl" to the Parser.keep attr, and I get much better API results. 😆 They still need to be boosted though.

2bndy5 avatar Apr 14 '24 09:04 2bndy5