Michael McCandless comments

Results: 216 comments by Michael McCandless

GITHUB#11795: Add FilterDirectory to track write amplification factor

> Also @dsmiley that's an interesting suggestion! I'm not as familiar with Lucene as some of the other people commenting here but I would be open to adding this to...

GITHUB#11795: Add FilterDirectory to track write amplification factor

> I'm considering exposing write amplification separately for flushes (as `flushedBytes / totalIndexSize`), merges (as `(totalIndexSize + mergedBytes) / totalIndexSize`) and temporary files (as `(totalIndexSize + tempBytes) / totalIndexSize`) and...
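The three ratios in the quoted comment can be written out as a minimal sketch. The class and method names below are hypothetical and are not part of Lucene or of the merged change; the excerpt is truncated, so the design that was finally adopted may differ from these formulas.

```java
/**
 * Hypothetical helper illustrating the three ratios quoted above.
 * This is NOT a Lucene API; all names here are assumptions for illustration.
 */
public final class WriteAmplificationSketch {

    private WriteAmplificationSketch() {}

    /** Flush ratio as quoted: flushedBytes / totalIndexSize. */
    public static double flushFactor(long flushedBytes, long totalIndexSize) {
        requirePositive(totalIndexSize);
        return (double) flushedBytes / totalIndexSize;
    }

    /** Merge ratio as quoted: (totalIndexSize + mergedBytes) / totalIndexSize. */
    public static double mergeFactor(long mergedBytes, long totalIndexSize) {
        requirePositive(totalIndexSize);
        return (double) (totalIndexSize + mergedBytes) / totalIndexSize;
    }

    /** Temporary-file ratio as quoted: (totalIndexSize + tempBytes) / totalIndexSize. */
    public static double tempFactor(long tempBytes, long totalIndexSize) {
        requirePositive(totalIndexSize);
        return (double) (totalIndexSize + tempBytes) / totalIndexSize;
    }

    // Guard against division by zero for an empty index (an assumption of this sketch).
    private static void requirePositive(long totalIndexSize) {
        if (totalIndexSize <= 0) {
            throw new IllegalArgumentException("totalIndexSize must be > 0");
        }
    }

    public static void main(String[] args) {
        long totalIndexSize = 1_000L; // made-up byte counts, for illustration only
        System.out.println("flush: " + flushFactor(1_000L, totalIndexSize));
        System.out.println("merge: " + mergeFactor(500L, totalIndexSize));
        System.out.println("temp:  " + tempFactor(250L, totalIndexSize));
    }
}
```

With these definitions the merge and temporary-file ratios are never below 1, while the flush ratio equals 1 only when every flushed byte is still part of the final index.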

GITHUB#11795: Add FilterDirectory to track write amplification factor

This looks great to me! I love all the engagement (83+ comments!) and how it iterated to such a simple solution. I left a small comment for a follow-on issue...

GITHUB#11795: Add FilterDirectory to track write amplification factor

Thanks @mdmarshmallow! Sorry for the delay merging ... I will backport to 9.x then let's get this in nightly benchmarks :)

GITHUB#11795: Add FilterDirectory to track write amplification factor

9.x backport done: https://github.com/apache/lucene/commit/373d2e84c13ee67e8e1247338e69b53946b7f726

Release manager should review lucene benchmarks before building release candidates

> yup. Possibly too if Mike M is bored he could implement an alarming system :) or export the data somehow so we could bolt one on the side? Actually...

Release manager should review lucene benchmarks before building release candidates

This was a spinoff from #11824.

GH#11601: Add ability to compute reader states after refresh

> 3\. Allow the user to update the ordinal maps in the reader states they already have without requiring them to completely recreate the reader states. I’m not sure how...

Optimize top-k counting for approximate queries

Hmm, why is `Self time` so high in your profiler output? What is `countHits` actually doing? Is there any way to cluster multiple hashes into a single term during indexing?...

Optimize top-k counting for approximate queries

> Hi @mikemccand, thanks for the reply. As a side note, I've found many of your articles very helpful! Thanks, I am glad to hear that :) The `countHits` method...