mkdocs-techdocs-core Optionally use material search

fixes: #192

see also: https://github.com/backstage/backstage/pull/25497

The search/search_index.json template behaves differently depending on which search plugin is in use.

With this change, users can opt into the Material theme's search plugin for generating the index.

Differences:

| | MkDocs search | Material theme search |
|---|---|---|
| Document split | Both entire document and heading sections | Only headings |
| Content escaping | Yes, both title and text | No, only titles. Text still contains HTML tags |
| Tags support | No | Yes |
| Boost support | No | Yes |

Jul 02 '24 11:07 alexef

(I validated that this works as expected in https://github.com/alexef/mkdocs-techdocs-core/blob/dev/src/core.py#L22, but I have yet to validate that custom settings, configured the way this PR proposes, will work.)

Jul 02 '24 12:07 alexef

We are using it locally and search seems to work just like before (except that now we don't get duplicate results).

Updated the README and added a tiny unit test. If you can, please have another look.

Jul 02 '24 14:07 alexef

Hello 👋

First of all, thanks for the PR! We also need to boost individual pages in TechDocs, but we haven't implemented this yet.

We tried using the material-mkdocs search plugin on our side and had some issues with it. I'm curious how you managed to get around them. With this PR, enabling the flag will result in Backstage search results containing raw HTML tags.

The text contains HTML tags (like <p> and <code>), which are used for rich text previews in MkDocs Material.

We've also seen that when using Elasticsearch with highlighting, the HTML tags are not taken into account: tags can be truncated in the search result highlight, and highlighting segments conflict with the HTML tags.

How are you handling the display of search results in Backstage?

Jul 09 '24 14:07 tcardonne

@tcardonne At the moment I'm using a custom document collator in Backstage that sits between the default one (which downloads search_index.json from S3) and OpenSearch, and I strip the tags before the documents are sent to OpenSearch. So my index doesn't contain HTML tags, and the search results (with highlighting) work just like before.

In https://github.com/backstage/backstage/pull/25497 I'm preparing a less intrusive approach that requires minimal code changes and lets us keep using the DefaultTechDocsCollatorFactory collator, by introducing a TechDocs transformer that adds tags and strips HTML.
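
A minimal sketch of the tag-stripping step described above, written in Python for illustration only (the actual Backstage collator or transformer would be TypeScript). The names `strip_html` and `clean_search_docs` are hypothetical, and the entries are assumed to follow the `docs` list shape of search_index.json (`location`, `title`, `text`).

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment, dropping all tags."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        # Collapse whitespace runs left behind where tags were removed.
        return " ".join("".join(self._chunks).split())


def strip_html(fragment):
    """Return the plain text of an HTML fragment (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    return parser.text()


def clean_search_docs(docs):
    """Return copies of the index entries with tags removed from `text`.

    Assumes each entry is a dict with a `text` key, as in search_index.json.
    Titles are left untouched, since they are already escaped.
    """
    return [{**doc, "text": strip_html(doc.get("text", ""))} for doc in docs]


if __name__ == "__main__":
    sample = [
        {
            "location": "guide/#install",
            "title": "Install",
            "text": "<p>Run <code>mkdocs build</code> to generate &amp; publish.</p>",
        }
    ]
    print(clean_search_docs(sample)[0]["text"])
```

Stripping before indexing, rather than at display time, keeps the highlighter from ever seeing markup, which is why highlighting behaves as it did with the default MkDocs search index. Note that adjacent block elements with no whitespace between them would have their text joined without a space; a real implementation may want to handle that.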

Jul 10 '24 06:07 alexef

up

Jul 18 '24 08:07 alexef