
Update benchmark with ugrep 5.0

Open genivia-inc opened this issue 10 months ago • 5 comments

Nice project!

I came across it because a ugrep user found a Reddit post about the project. Thanks for including ugrep in the benchmarks. It is always interesting to see how ugrep compares. If there are speed bottlenecks, I put them on the TODO list to resolve (see my note below).

The ugrep version used in these benchmarks isn't up to date. Better search methods and optimizations were implemented in ugrep 3.12 and continued with 4.0. You may want to run the benchmarks with the latest ugrep, e.g. v4.0.5. The old version 3.11 had an issue where it picked the wrong search algorithm in some cases, leading to a fallback to the slower default method.

Note that there are also plans to improve ugrep's search speed further with some tricks that other grep tools use, and to add a new method for pattern cases that ugrep doesn't yet handle well. One example is a pattern with a long initial wildcard like .*foo, which is not optimized at all but is easy to handle by looking for foo. These plans were (and some still are) delayed to give priority to new and improved features while ensuring reliability. I won't list the things that were added over time, since this is about hypergrep, not ugrep. Getting high performance together with lots of new features is simply a tricky combination.
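To make the .*foo remark concrete, here is a minimal Python sketch of the idea. It is not ugrep's code, and the function names are made up for this illustration. It assumes line-oriented matching, where only the question "does this line match?" matters: a line matches `.*foo` exactly when it contains the literal `foo`, so a plain substring search can select the same lines as the regex.

```python
import re


def matching_lines_regex(lines, pattern):
    """Select lines the straightforward way: run the regex on every line."""
    rx = re.compile(pattern)
    return [ln for ln in lines if rx.search(ln)]


def matching_lines_literal(lines, literal):
    """Select lines with a plain substring test instead of a regex.

    For a pattern of the form '.*' + literal, a line matches if and only if
    it contains the literal, so this returns the same lines.
    """
    return [ln for ln in lines if literal in ln]


# Hypothetical sample input, one entry per line and no embedded newlines.
lines = ["a foo b", "bar", "foo", "xfo o", "prefix foofoo"]

by_regex = matching_lines_regex(lines, r".*foo")
by_literal = matching_lines_literal(lines, "foo")
```

Both calls select the same lines here. A real grep would still need the regex engine if it has to report the exact matched span, since `.*foo` extends from the start of the line to the last `foo`.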

People always ask for new features or improvements to existing ones; they care a bit less about speed. Most often, if they really want speed, it's to search "cold" file systems. Then qgrep might be a good choice, or perhaps ugrep-indexer.

I am curious, is hypergrep based on hyperscan? EDIT: sorry, dumb question, I see the answer in the README up front now. I wanted to ask, because hypergrep has a "simple grep" example that works OK, but not stellar.

genivia-inc, Aug 28 '23 02:08