[BUG] When I search URLs I get the right file result but the highlight is always something else...

Open tracure1337 opened this issue 2 years ago • 8 comments

Problem description:

After Obsidian destroyed their search with the new colon option, I use Omnisearch to be able to search for URLs in my knowledge base. I know, a very, very edge case.

When I search URLs I get the right file result, but the highlight is always something else...

Pressing enter opens the right file but does not put the cursor at the search result.

image

Your environment: latest Obsidian, latest Omnisearch, latest macOS / latest Linux

tracure1337 avatar Oct 17 '23 14:10 tracure1337

Omnisearch does a token-based search, and the url https://foo.bar/baz is actually read as https foo bar baz. The highlighting reflects that.
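
To make the effect concrete, here is a minimal sketch of a token-based split. The function `naiveTokenize` is hypothetical and only illustrates the idea; it is not Omnisearch's actual tokenizer, which may treat separators differently.

```typescript
// Hypothetical tokenizer for illustration only (not Omnisearch's real code):
// lowercases the text and splits on any run of non-alphanumeric characters.
function naiveTokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// The URL loses its punctuation and becomes four independent terms.
console.log(naiveTokenize("https://foo.bar/baz"));
// prints [ 'https', 'foo', 'bar', 'baz' ]
```

With a split like this, each term can match anywhere in a note, so the highlighted excerpt may land on a stray `https` or `foo` rather than on the full URL.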

Though I understand how it can be unexpected with this kind of use case, and I will look into reworking the highlight and "go to line".

(Might be related to #301 too)

scambier avatar Oct 17 '23 14:10 scambier

Yes, I also have the same issue as #301.

Maybe, optionally, a list of regexes that will always be searched as full strings? Then I could declare URLs to always be searched non-token-based.

tracure1337 avatar Oct 18 '23 07:10 tracure1337

much love @scambier

tracure1337 avatar Nov 01 '23 15:11 tracure1337

This is now deployed in the 1.19.0-beta.1 version (if you use BRAT). I'll probably push it on the main branch within 2 weeks. You'll also need to clear your Omnisearch cache for a full reindex.

scambier avatar Nov 02 '23 17:11 scambier

@scambier Could it be that Obsidian changed something? I cannot query for URLs again... I just noticed that right now. I cleared the cache, restarted, and made sure I have all updates installed.

tracure1337 avatar Nov 25 '23 18:11 tracure1337

Nope, no change since the update that included this feature 🤔

image

scambier avatar Nov 25 '23 18:11 scambier

Okay, it seems to work most of the time, but still not every time.

See this example. What could the reason be? Would there be a way to add more weight to full matches, especially with URLs?

image

tracure1337 avatar Nov 30 '23 07:11 tracure1337

Still having these issues. Thanks for your time :-)

tracure1337 avatar Jan 28 '24 07:01 tracure1337

hi, i also have this issue.

query: "normie.cc"

part of markdown file with match:

# contents
- main homepage with social icons + something custom
	- https://kasper.space cool starry animated bg
	- https://thedise.me nice background, social icons
	- https://www.normie.cc cool greeting font
	- [tinyclouds](https://tinyclouds.org) cool social icons

omnisearch: image image image

the highlighting works somewhat properly when using in-file search though? image image

KraXen72 avatar Feb 21 '24 17:02 KraXen72

I think the solution would be an option for an ignorelist.

Whenever I remove the https scheme, for example, it works. I personally would just add https://github.com to the ignorelist so the search always focuses only on the remainder of the path.

example:

https://github.com/scambier/obsidian-omnisearch --> only searches for scambier/obsidian-omnisearch
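
A rough sketch of what such an ignorelist could look like. Everything here is an assumption for illustration: `ignoredPrefixes` and `stripIgnoredPrefix` are hypothetical names, and Omnisearch does not necessarily expose a setting like this.

```typescript
// Hypothetical ignorelist of query prefixes (not an existing Omnisearch setting).
const ignoredPrefixes: string[] = ["https://github.com/", "https://", "http://"];

// Removes the longest matching ignored prefix from the start of the query.
function stripIgnoredPrefix(query: string, prefixes: string[]): string {
  // Try longer prefixes first so the most specific entry wins.
  const sorted = prefixes.slice().sort((a, b) => b.length - a.length);
  for (const prefix of sorted) {
    const head = query.slice(0, prefix.length).toLowerCase();
    if (head === prefix.toLowerCase()) {
      return query.slice(prefix.length);
    }
  }
  return query; // No ignored prefix found: search the query unchanged.
}

console.log(
  stripIgnoredPrefix("https://github.com/scambier/obsidian-omnisearch", ignoredPrefixes)
);
// prints scambier/obsidian-omnisearch
```

Queries that do not start with an ignored prefix pass through untouched, so ordinary searches would behave as before.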

@scambier Please please stop our suffering :-)

tracure1337 avatar Mar 24 '24 07:03 tracure1337

i think a better solution might be rewriting the highlighting to support highlighting several matched tokens

KraXen72 avatar Mar 24 '24 07:03 KraXen72

I published version 1.22.0-beta.1 last week, which should improve the general highlighting.

scambier avatar Mar 24 '24 08:03 scambier

Just tried it out. Unfortunately no improvement so far. I deleted the cache and restarted + reindexed as well.
image

Example:

image

image

Thank you for your continued pursuit of this issue. Really highly appreciated.

Maybe if it could just ignore a list of schemes?

tracure1337 avatar Mar 24 '24 19:03 tracure1337

from how omnisearch (and under the hood, minisearch) works, the whole query gets split into terms. many files have some of the terms from that url, but it so happens that the file with the full url has the most terms, so it's at the top. the problem is, not all the terms are necessarily found one after another in the file, leading to the highlighting you see.

i think it could be worth exploring detecting urls in the search query, and adding custom highlighting logic if the query contains a url, to try to highlight most of it without interruptions. it also could be useful to see if minisearch has some options which are relevant to this problem. (something like a locality or continuity bonus)
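
A minimal sketch of that idea, under stated assumptions: `urlPattern` is a deliberately rough URL-like pattern, and `findContiguousHighlights` is a hypothetical helper, not part of Omnisearch or MiniSearch. It only shows the "detect a URL in the query, then highlight it as one uninterrupted span" step.

```typescript
interface HighlightRange {
  start: number; // inclusive offset into the note content
  end: number;   // exclusive offset
}

// Rough URL-like pattern (assumption): optional scheme, dotted host, optional path.
const urlPattern = /(?:https?:\/\/)?[a-z0-9-]+(?:\.[a-z0-9-]+)+(?:\/[^\s)]*)?/i;

// If the query contains something URL-like, return every place where that
// exact string appears in the content, so it can be highlighted as one span.
function findContiguousHighlights(query: string, content: string): HighlightRange[] {
  const match = query.match(urlPattern);
  if (!match) return []; // Not URL-like: the caller keeps per-token highlighting.

  const needle = match[0].toLowerCase();
  const haystack = content.toLowerCase();
  const ranges: HighlightRange[] = [];
  let from = 0;
  while (true) {
    const index = haystack.indexOf(needle, from);
    if (index === -1) break;
    ranges.push({ start: index, end: index + needle.length });
    from = index + needle.length;
  }
  return ranges;
}

const line = "- https://www.normie.cc cool greeting font";
const ranges = findContiguousHighlights("normie.cc", line);
console.log(ranges.map((r) => line.slice(r.start, r.end)));
// prints [ 'normie.cc' ]
```

An empty result means the query did not look like a URL (or the URL is not in the note), in which case the existing token-based highlighting would still apply.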

KraXen72 avatar Mar 24 '24 20:03 KraXen72

I published 1.22.0-beta.2.

URLs are now considered whole tokens, so that should greatly improve results. I also reworked the highlighting a bit.
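
For readers curious what "whole tokens" can mean in practice, here is a hedged sketch. It is not the code shipped in 1.22.0-beta.2; `urlRegex` and `tokenizeKeepingUrls` are hypothetical, and the real implementation may index URLs differently (for example, alongside their parts).

```typescript
// Rough URL pattern (assumption): a scheme, then everything up to whitespace
// or a closing bracket.
const urlRegex = /https?:\/\/[^\s<>()\[\]]+/gi;

// Emits each URL as a single token, then tokenizes the remaining text normally.
function tokenizeKeepingUrls(text: string): string[] {
  const lower = text.toLowerCase();
  const urls: string[] = lower.match(urlRegex) || [];
  // Blank out the URLs so their parts are not tokenized a second time.
  const rest = lower.replace(urlRegex, " ");
  const words = rest.split(/[^a-z0-9]+/).filter((t) => t.length > 0);
  return urls.concat(words);
}

console.log(tokenizeKeepingUrls("See https://www.normie.cc cool greeting font"));
// prints [ 'https://www.normie.cc', 'see', 'cool', 'greeting', 'font' ]
```

Because the URL survives as one term, a query for the full URL no longer matches notes that merely contain `https` or `www` somewhere.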

scambier avatar Mar 26 '24 20:03 scambier

Just updated and will report back on how it works out. So far it has improved for sure.

tracure1337 avatar Apr 04 '24 22:04 tracure1337

So far it works GREAT!!!!
The highlighting is a little off and weird sometimes, but that is absolutely no problem at all. I can show a few examples over time if you want me to. I think it still has to do with too many results for the protocol scheme, so I still think it would be cool to be able to ignore any protocol://

But anyways. Thank you for taking care of this!!! HIGHLY appreciated.

tracure1337 avatar Apr 06 '24 11:04 tracure1337

Neat :)

> I can show a few examples over time if you want me to.

If you happen to have a minimal reproducible example (a note + a search query), I'll gladly take it 👍

scambier avatar Apr 07 '24 12:04 scambier

Hello, due to #363, I've gated this feature behind an opt-in setting in the Behavior section.

scambier avatar Apr 15 '24 05:04 scambier

Today I noticed ;-) All good.

I did not find any more reproducible examples.

I feel this "issue" is truly solved. Thanks again @scambier!

tracure1337 avatar May 08 '24 15:05 tracure1337

Thanks for your feedback o/

scambier avatar May 08 '24 16:05 scambier