sphinx
search: support searching for (sub)titles
Collect the titles from all pages and use a case-insensitive full match against them on the Search page.
Fixes: #10689
Does this work for partial matches?
Yes, I've just added partial match support.
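For readers following along, here is a minimal sketch of what case-insensitive full and partial title matching could look like. It is not the code from this pull request: the index shape (title mapped to a list of docname/anchor pairs) and the name `search_titles` are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical index shape: title -> list of (docname, anchor) pairs.
TitleIndex = Dict[str, List[Tuple[str, Optional[str]]]]
Match = Tuple[str, str, Optional[str], bool]


def search_titles(index: TitleIndex, query: str) -> List[Match]:
    """Return (title, docname, anchor, is_full_match) for titles containing the query."""
    needle = query.strip().lower()
    if not needle:
        return []
    results: List[Match] = []
    for title, locations in index.items():
        haystack = title.lower()
        if needle in haystack:  # substring test covers partial matches
            is_full = haystack == needle
            for docname, anchor in locations:
                results.append((title, docname, anchor, is_full))
    # Full matches first; the sort is stable, so index order is kept within each group.
    results.sort(key=lambda match: not match[3])
    return results
```

Ranking full matches ahead of partial ones is a design choice of this sketch, not something stated in the thread.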
Please add tests and a CHANGES entry.
I've just added a CHANGES entry, but a working test would need to inspect the HTML of the rendered search.html page.
Do we have such tests?
As an improvement, it should be possible to emit a link directly to the title in a page.
But I can't find a function that returns the anchor for a nodes.title node:
{'rawsource': 'Demo documentation', 'children': [<#text: 'Demo documentation'>], 'attributes': {'ids': [], 'classes': [], 'names': [], 'dupnames': [], 'backrefs': []}, 'tagname': 'title', 'parent': <section "demo documentation": <title...><compound...><paragraph...><substitution_defin ...>, '_document': <document: <substitution_definition "gol"...><section "demo documen ...>, 'source': '/home/marxin/Programming/texi2rst-generated/sphinx/demo/index.rst', 'line': 2}
Can you please help me?
I think you may need to look at the parent section element to find the ids.
Note: In https://github.com/jbms/sphinx-immaterial I have implemented something similar entirely client side, but there are a few differences:
- The sub-sections are parsed from the HTML document itself, while extracting snippets.
- If the search text is found within the page, the result link is to the nearest containing section.
This is the source code of my implementation, for reference: https://github.com/jbms/sphinx-immaterial/blob/main/src/assets/javascripts/sphinx_search.ts
In general handling this when building the index is probably better, though given that the HTML must be parsed anyway to handle the snippets I'm not sure.
Yes, I do prefer the server side implementation and I'm still curious about the title links as mentioned in my previous comment. One should be able to get a link to them.
Something like node.parent["names"] should work? The anchor link is on the docutils.nodes.section node as I recall.
Yep, that almost works:
diff --git a/sphinx/search/__init__.py b/sphinx/search/__init__.py
index bbb28c0b9..7916d26f0 100644
--- a/sphinx/search/__init__.py
+++ b/sphinx/search/__init__.py
@@ -216,6 +216,8 @@ class WordCollector(nodes.NodeVisitor):
         elif isinstance(node, nodes.title):
             title = node.astext()
             self.found_titles.append(title)
+            print('node:', node)
+            print('node.parent[names]:', node.parent['names'])
             self.found_title_words.extend(self.lang.split(title))
         elif isinstance(node, Element) and self.is_meta_keywords(node):
             keywords = node['content']
emits something like:
node: <title>Comparison of GCC docs in Texinfo and Sphinx</title>
node.parent[names]: ['comparison of gcc docs in texinfo and sphinx']
node: <title>HTML output</title>
node.parent[names]: ['html output']
node: <title>Formatting</title>
node.parent[names]: ['formatting']
So the last missing piece is probably an escaping step that turns the name into an anchor, e.g. comparison-of-gcc-docs-in-texinfo-and-sphinx?
Ahh, can you try ["ids"]?
Works for me, added that.
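To illustrate the difference between the two attributes discussed above: `names` holds the normalized title text, while `ids` holds the anchor-safe form that can go after `#` in a URL. The sketch below only approximates that normalization for plain ASCII titles. It is not docutils' implementation (docutils has its own `nodes.make_id`, which handles more cases), and both `approximate_section_id` and `title_link` are hypothetical helpers, as is the `.html` link suffix.

```python
import re
from typing import List


def approximate_section_id(name: str) -> str:
    """Roughly turn a section name into an anchor-safe id (ASCII-only approximation)."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower())
    return slug.strip("-")


def title_link(docname: str, section_ids: List[str]) -> str:
    """Build a link to a title, falling back to the page itself if there is no id."""
    if section_ids:
        return f"{docname}.html#{section_ids[0]}"
    return f"{docname}.html"


# The title node itself carries no ids, so the ids come from its parent section.
link = title_link("index", [approximate_section_id("html output")])
```

In practice the ids should be read from the parent section node rather than recomputed, since the real ids may differ from this approximation (for example when two sections share a title).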
Can you please @AA-Turner review the pull request now?
Something seems to be wrong:
https://sphinx--10717.org.readthedocs.build/en/10717/search.html?q=More+topics+to+be+covered
https://www.sphinx-doc.org/en/master/search.html?q=More+topics+to+be+covered
The PR only shows 5 results, and doesn't highlight the title, whereas the current master shows the title, albeit as the third result.
Yeah, that's a fancy feature of Read the Docs; it must be a plug-in that is used for the Sphinx docs.
Please compare it with another pull request: https://sphinx--10807.org.readthedocs.build/en/10807/search.html?q=More+topics+to+be+covered
Rebased.
If running an incremental build, searchindex.js is only updated after loading it, so we need to bump the environment version.
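The general idea behind such a version bump can be sketched as follows: cached build data carries a version number, and data written under an older version is discarded rather than reused, so the new title information is regenerated. This is a generic illustration with hypothetical names (`CACHE_VERSION`, `save_cache`, `load_cache`) and a JSON file standing in for the cache; it does not reproduce how Sphinx stores its environment.

```python
import json
from typing import Any, Optional

CACHE_VERSION = 2  # hypothetical: bumped whenever the cached data layout changes


def save_cache(path: str, payload: Any, version: int = CACHE_VERSION) -> None:
    """Write the payload together with the version it was produced under."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"version": version, "payload": payload}, f)


def load_cache(path: str, version: int = CACHE_VERSION) -> Optional[Any]:
    """Return the cached payload, or None if it is missing, unreadable, or stale."""
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        return None
    if not isinstance(data, dict) or data.get("version") != version:
        return None  # stale cache: the caller should do a full rebuild
    return data.get("payload")
```

A caller that receives `None` rebuilds from scratch, which is what makes the bump effective for incremental builds.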
Thanks @marxin!
Thanks for merging that!