Search results don't link correctly if the url contains a "#"
Describe the bug
Standard links correctly encode the URL, but search-result links to a file with a # in its path are not encoded, so following them fails.
For example, if I have a file at src\languages\C#\csharp.md then the normal links are to src/languages/C%23/csharp.html but a search result that includes that file links to src/languages/C#/csharp.html.
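A minimal Python sketch of why the unencoded link ends in a 404, assuming the URL is split the way `urllib.parse.urlsplit` splits it (everything after the first `#` is treated as a fragment and is not part of the requested path). The variable names are illustrative only and are not taken from Sphinx:

```python
from urllib.parse import quote, urlsplit

# Link as emitted by the search results (unencoded '#').
raw_link = "src/languages/C#/csharp.html"
# Link as emitted by the normal navigation ('#' percent-encoded as %23).
encoded_link = quote(raw_link)

raw_parts = urlsplit(raw_link)
encoded_parts = urlsplit(encoded_link)

# The raw link is cut at '#': only "src/languages/C" is requested as a path,
# and "/csharp.html" is treated as a fragment.
print("raw path:     ", raw_parts.path)
print("raw fragment: ", raw_parts.fragment)

# The encoded link keeps the whole path and has no fragment.
print("encoded path: ", encoded_parts.path)
```

With the raw link, the server is only asked for `src/languages/C`, which does not exist.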
How to Reproduce
Create a file with a # in the name, and something inside it that can be searched for.
Generate the docs.
Search for the text, and click the link.
Get a 404.
Environment Information
Platform: win32; (Windows-10-10.0.26200-SP0)
Python version: 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)])
Python implementation: CPython
Sphinx version: 7.4.7
Docutils version: 0.21.2
Jinja2 version: 3.1.6
Pygments version: 2.19.2
Sphinx extensions
myst_parser, sphinx_tabs.tabs
Additional context
N/A
@andrewducker do you know of any live URLs on the web where an # symbol is used within the URL path, or is this primarily a request to cater to local file resources?
(I realise that this specific issue is about the encoding of those characters within the search result listings -- even so, I'm curious and there might be something tangentially related to consider)
This happened to us with our internal documentation. With exactly the example given above.
(We write a lot of our code in C# - and up to now haven't noticed an issue with the folder being called that - because the actual links escape the #)
It's not being accessed from a file resource though, it's on an internal web server.
Instead of #14058, could we just do:
Index: sphinx/search/__init__.py
===================================================================
diff --git a/sphinx/search/__init__.py b/sphinx/search/__init__.py
--- a/sphinx/search/__init__.py (revision 9ca942bef6cbebf7aaf999ed81e49fca336a1d85)
+++ b/sphinx/search/__init__.py (date 1764040661280)
@@ -9,6 +9,7 @@
 import os
 import pickle
 import re
+import urllib.parse
 from importlib import import_module
 from typing import TYPE_CHECKING
@@ -428,6 +429,7 @@
     def freeze(self) -> dict[str, Any]:
         """Create a usable data structure for serializing."""
         docnames, titles = zip(*sorted(self._titles.items()), strict=True)
+        quoted_docnames = tuple(map(urllib.parse.quote, docnames))
         filenames = [self._filenames.get(docname) for docname in docnames]
         fn2index = {f: i for (i, f) in enumerate(docnames)}
         terms, title_terms = self.get_terms(fn2index)
@@ -451,7 +453,7 @@
                 ))
         return {
-            'docnames': docnames,
+            'docnames': quoted_docnames,
             'filenames': filenames,
             'titles': titles,
             'terms': terms,
Does this fix your problem @andrewducker?
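For reference, a standalone sketch of what the added `quoted_docnames` line in the diff above does. The `docnames` tuple here is made up for illustration and stands in for the sorted keys of `self._titles`; nothing else from Sphinx is needed to run it:

```python
import urllib.parse

# Hypothetical docnames, standing in for the sorted keys of self._titles.
docnames = (
    "index",
    "src/languages/C#/csharp",
    "src/languages/python/intro",
)

# Same expression as in the proposed patch. quote() leaves '/' untouched
# by default (safe='/'), so only characters such as '#' are percent-encoded.
quoted_docnames = tuple(map(urllib.parse.quote, docnames))

for original, quoted in zip(docnames, quoted_docnames):
    print(f"{original!r} -> {quoted!r}")

# Decoding recovers the original names, so nothing is lost by quoting.
round_tripped = tuple(map(urllib.parse.unquote, quoted_docnames))
```

Docnames without special characters pass through unchanged, so only entries like the `C#` folder are affected.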