Viktor
The search engine fingerprints the webserver to try to figure out what sort of a website it is. This is done by looking at the meta generator tag and various...
It would be a very nice productivity enhancement if StaticResources could do a hot reload, or at least be made to do a hot reload in test when developing locally....
In `SearchApiQueryService` change
```
profile = switch (index) {
    case "0" -> SearchProfile.YOLO;
    case "1" -> SearchProfile.MODERN;
    case "2" -> SearchProfile.DEFAULT;
    case "3" -> SearchProfile.CORPO_CLEAN;
    default -> SearchProfile.CORPO_CLEAN;
};
```
...
Issues: * Quicksort uses an inclusive upper bound. There are also improvements to be made to the algorithm itself. It's the most basic-ass quicksort algorithm right now; better variants exist and might speed...
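As a rough illustration of what the two points could look like together, here is a minimal sketch of a quicksort over a `long[]` range with an exclusive upper bound (the same convention as `Arrays.sort(a, fromIndex, toIndex)`), using a median-of-three pivot and an insertion-sort cutoff for small ranges. The class name `QuickSortSketch`, the threshold of 16, and the choice of these particular refinements are assumptions made for the example, not the existing code or a benchmarked recommendation.

```java
import java.util.Arrays;

/** Hypothetical sketch; not the sorting code currently in the repository. */
final class QuickSortSketch {
    // Assumed cutoff; the best value would need measuring.
    private static final int INSERTION_SORT_THRESHOLD = 16;

    /** Sorts data[fromIndex, toIndex). toIndex is exclusive, as in Arrays.sort. */
    static void sort(long[] data, int fromIndex, int toIndex) {
        while (toIndex - fromIndex > INSERTION_SORT_THRESHOLD) {
            int p = partition(data, fromIndex, toIndex);
            // Recurse into the smaller side and loop on the larger one,
            // which keeps the stack depth logarithmic.
            if (p - fromIndex < toIndex - (p + 1)) {
                sort(data, fromIndex, p);
                fromIndex = p + 1;
            } else {
                sort(data, p + 1, toIndex);
                toIndex = p;
            }
        }
        insertionSort(data, fromIndex, toIndex);
    }

    /** Lomuto partition with a median-of-three pivot; returns the pivot's final index. */
    private static int partition(long[] data, int fromIndex, int toIndex) {
        int lo = fromIndex, hi = toIndex - 1, mid = lo + (hi - lo) / 2;
        if (data[mid] < data[lo]) swap(data, mid, lo);
        if (data[hi] < data[lo]) swap(data, hi, lo);
        if (data[hi] < data[mid]) swap(data, hi, mid);
        // Now data[lo] <= data[mid] <= data[hi]; park the median at the end.
        swap(data, mid, hi);
        long pivot = data[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (data[i] < pivot) swap(data, i, store++);
        }
        swap(data, store, hi);
        return store;
    }

    private static void insertionSort(long[] data, int fromIndex, int toIndex) {
        for (int i = fromIndex + 1; i < toIndex; i++) {
            long v = data[i];
            int j = i - 1;
            while (j >= fromIndex && data[j] > v) {
                data[j + 1] = data[j];
                j--;
            }
            data[j + 1] = v;
        }
    }

    private static void swap(long[] data, int a, int b) {
        long t = data[a];
        data[a] = data[b];
        data[b] = t;
    }

    public static void main(String[] args) {
        long[] data = {9, 3, 7, 1, 8, 2, 5};
        sort(data, 1, 6); // sorts indices 1..5 only; index 6 is excluded
        System.out.println(Arrays.toString(data));
    }
}
```

With an exclusive bound, `sort(data, 0, data.length)` sorts the whole array and an empty range is simply `fromIndex == toIndex`, so callers need no `- 1` adjustments.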
These are presently available internally in `ResultRankingParameters`, most are probably safe to expose. ``` public final Bm25Parameters fullParams; /** Tuning for BM25 when applied to priority matches, terms with relevance...
Marginalia uses Porter stemming quite frequently in various applications. It's worth examining options. Porter's algorithm is pretty janky, and considers e.g. 'universe' and 'university' to be of the same stem....
[features-convert/summary-extraction](https://github.com/MarginaliaSearch/MarginaliaSearch/tree/master/code/features-convert/summary-extraction), that is, the logic that extracts a summary of each document, is very inconsistent. This is a pretty difficult problem with two steps: 1. Find "the text" of the...
Hi, I [forked the parquet-floor](https://github.com/MarginaliaSearch/MarginaliaSearch/tree/master/third-party/parquet-floor) repository for use in Marginalia Search, and hacked in support for repeated values (both java.util.List and Trove Lists, but the second is due to my...
To provide the result ranking algorithm with more accurate information, and enable full "quoted search terms"-support, this change removes the position bitmask and replaces it with a gamma-coded sequence of...
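For readers unfamiliar with gamma coding, the sketch below shows the general idea of Elias gamma codes: a positive integer n is written as floor(log2 n) zero bits followed by the binary representation of n, so small values take very few bits. The class name `GammaCodeSketch` is hypothetical, and the bits are kept in a `String` purely for readability. This illustrates the coding scheme in general, under those assumptions, and not the encoding layout used by this change.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of Elias gamma coding; not the project's implementation. */
final class GammaCodeSketch {

    /** Encodes each value (must be >= 1) as (bitLength - 1) zeros followed by its binary form. */
    static String encode(int[] values) {
        StringBuilder bits = new StringBuilder();
        for (int v : values) {
            if (v < 1) {
                throw new IllegalArgumentException("gamma codes require values >= 1");
            }
            String binary = Integer.toBinaryString(v);
            bits.append("0".repeat(binary.length() - 1)).append(binary);
        }
        return bits.toString();
    }

    /** Decodes a bit string produced by encode back into the original values. */
    static int[] decode(String bits) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        while (i < bits.length()) {
            // Count the leading zeros; they give the length of the payload minus one.
            int zeros = 0;
            while (bits.charAt(i) == '0') {
                zeros++;
                i++;
            }
            out.add(Integer.parseInt(bits.substring(i, i + zeros + 1), 2));
            i += zeros + 1;
        }
        return out.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        String bits = encode(new int[]{1, 2, 3, 4});
        System.out.println(bits); // 1, 010, 011, 00100 concatenated
        System.out.println(java.util.Arrays.toString(decode(bits)));
    }
}
```

A real implementation would pack the bits into bytes or longs rather than characters; the string form only makes the code words visible.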
In https://search.marginalia.nu/site/brainbaking.com, the website link and preview URLs are all for git.brainbaking.com. This seems decidedly off. Additionally, several of the documents have broken summaries containing goatcounter HTML.