Michael McCandless

Results 350 comments of Michael McCandless

> > I'm still not happy with the 2-3 K QPS :) Something seems amiss
> >
> > I suspect it's because the only scoring query produces a constant score. So...

> > The benchmark writes detailed "results" files for each iteration -- can you peek at those and confirm your two forms of the same query are in fact getting...

These gains indeed look correct (identical hit counts from X and XOpt tasks) and significant, especially for the N-term OR cases! Thanks @shubhamvishu. I thought we had a Lucene issue...

Actually, I'm not sure how the `OrNegatedNTerms` rewrite case would work? It's a strange query e.g. `q=(-body:eric *:*) (-body:kansas *:*)`. I'm not sure this really happens in practice very often?...
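To make the semantics of that query concrete, here is a minimal sketch using plain Java sets rather than Lucene itself. It assumes each clause `(-body:X *:*)` matches every document that does not contain `X`, so the OR of two such clauses matches every document missing at least one of the terms. The class and method names (`NegatedOrSketch`, `countDirect`, `countViaComplement`) are hypothetical, and this is only an illustration of the counting identity, not how any actual rewrite is implemented.

```java
import java.util.HashSet;
import java.util.Set;

public class NegatedOrSketch {

    /** Documents in `all` that are not in `term`, i.e. the clause (-term *:*). */
    static Set<Integer> negate(Set<Integer> all, Set<Integer> term) {
        Set<Integer> result = new HashSet<>(all);
        result.removeAll(term);
        return result;
    }

    /** Counts (-a *:*) OR (-b *:*) by materializing both clauses and taking the union. */
    static int countDirect(Set<Integer> all, Set<Integer> a, Set<Integer> b) {
        Set<Integer> union = negate(all, a);
        union.addAll(negate(all, b));
        return union.size();
    }

    /** Same count via De Morgan: all docs minus the docs that contain both a and b. */
    static int countViaComplement(Set<Integer> all, Set<Integer> a, Set<Integer> b) {
        Set<Integer> both = new HashSet<>(a);
        both.retainAll(b);   // docs matching a AND b
        both.retainAll(all); // ignore ids outside the index
        return all.size() - both.size();
    }

    public static void main(String[] args) {
        Set<Integer> all = Set.of(0, 1, 2, 3, 4, 5);
        Set<Integer> eric = Set.of(1, 2, 3);
        Set<Integer> kansas = Set.of(3, 4);
        System.out.println(countDirect(all, eric, kansas));
        System.out.println(countViaComplement(all, eric, kansas));
    }
}
```

Both methods should agree for any input, which is the property a count-oriented rewrite of this shape would rely on.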

> Note: I tried writing the `count(-most -september *:*)` task as `count(-(most september) *:*)`, but that doesn't seem to do what we want and results in 0 results. so I had...

I like that idea @arafalov -- that would give us a nice initial tokenization, and the deep metadata (class name, method name, a class being subclassed, etc.) could enable awesome...

Looks like [the `com.sun.source.tree` package](https://docs.oracle.com/en/java/javase/21/docs/api/jdk.compiler/com/sun/source/tree/package-summary.html) has all the juicy stuff.
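As a rough sketch of what pulling that metadata out could look like, the following parses a Java source string with the JDK's own compiler API and collects class names, `extends` clauses, and method names. The class name `SourceMetadataSketch`, the `extract` helper, and the output format are hypothetical choices for illustration; it assumes a full JDK (not a JRE-only runtime) so that `ToolProvider.getSystemJavaCompiler()` returns a compiler.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class SourceMetadataSketch {

    /** Wraps a source string so javac can parse it without touching disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Parses (but does not compile) the source and returns simple metadata strings. */
    public static List<String> extract(String className, String code) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler; a full JDK is required");
        }
        JavacTask task = (JavacTask) compiler.getTask(
            null, null, null, null, null, List.of(new StringSource(className, code)));

        List<String> out = new ArrayList<>();
        for (CompilationUnitTree unit : task.parse()) {
            new TreeScanner<Void, Void>() {
                @Override
                public Void visitClass(ClassTree node, Void p) {
                    String ext = node.getExtendsClause() == null
                        ? "" : " extends " + node.getExtendsClause();
                    out.add("class " + node.getSimpleName() + ext);
                    return super.visitClass(node, p); // keep descending into members
                }

                @Override
                public Void visitMethod(MethodTree node, Void p) {
                    out.add("method " + node.getName());
                    return super.visitMethod(node, p);
                }
            }.scan(unit, null);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        extract("Foo", "class Foo extends Bar { void baz() {} }").forEach(System.out::println);
    }
}
```

Because this only calls `parse()`, the superclass does not need to resolve, which seems like a reasonable fit for tokenizing source files in isolation.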

It's the last `boolean` argument to the `LocalTaskSource` ctor that groups all tasks by category (running them all sequentially within each thread) when concurrency is enabled: https://github.com/mikemccand/luceneutil/commit/87a806341b008e959376ab0f1c8cfc0997a07d7a

> Grouping was added here: #56 because without it the QPS measurement per task was meaningless

OH! Thanks for digging @msokolov ... hmm I think this is why we have...

> I guess we could switch from measuring wall clock time to measuring CPU time using [Java's JMX API](https://docs.oracle.com/cd/E17802_01/j2se/j2se/1.5.0/jcp/beta1/apidiffs/java/lang/management/ThreadMBean.html#getThreadCpuTime). If we did that we wouldn't need to group tasks this...
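For reference, a minimal sketch of per-thread CPU timing with the JDK management API might look like the following. In current JDKs the interface is named `ThreadMXBean` (in `java.lang.management`). The `CpuTimeSketch` class and `cpuNanos` helper are hypothetical names, and this is not how the benchmark currently measures anything; it only shows the calls involved.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CpuTimeSketch {

    /**
     * Runs the task on the calling thread and returns the CPU time it consumed
     * in nanoseconds, or -1 if the JVM cannot measure per-thread CPU time.
     */
    public static long cpuNanos(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isCurrentThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled()) {
            return -1;
        }
        long start = bean.getCurrentThreadCpuTime();
        task.run();
        return bean.getCurrentThreadCpuTime() - start;
    }

    public static void main(String[] args) {
        long wallStart = System.nanoTime();
        long cpu = cpuNanos(() -> {
            long sum = 0;
            for (int i = 0; i < 50_000_000; i++) {
                sum += i;
            }
            if (sum == 42) {
                System.out.println("unreachable"); // keeps the loop from being optimized away
            }
        });
        long wall = System.nanoTime() - wallStart;
        System.out.println("cpu ns=" + cpu + " wall ns=" + wall);
    }
}
```

Unlike wall-clock time, the CPU figure excludes time the thread spends descheduled or waiting, which is why it would be less sensitive to how tasks from different categories interleave across threads.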