Michael McCandless
Soon Lucene will make the "big switch" from Jira to GitHub issues. We must also fix (and perhaps rename!!??) the https://jirasearch.mikemccandles.com `indexJira.py` tool so that it cuts over to GitHub's export API.
Today in `luceneserver`, shard transfer is entirely node-to-node communication. This is fast/efficient, but it may mean that, in a deep replica case, the primary shard...
I think as `luceneserver` works today, a single node always does both indexing and searching. For a default cluster that is probably OK/best? But for better isolation we should allow scheduling...
I think I started this, long ago ... and the idea was not very well thought out, so let's try to figure out the design here: The idea was to (by...
@jpountz created an [awesome benchmark to measure `OrdinalMap` construction time](https://github.com/mikemccand/luceneutil/commit/a8eeff092582bc068f391bb56b72669949065fe3). Let's turn this on in nightly benchmarks? It could/should be nearly the same thing we did recently to enable stored...
@rmuir observed in [this issue](https://issues.apache.org/jira/browse/LUCENE-10250) that Wikipedia already has labels/categories per page, and these labels have sub-categories, etc. This would be another (in addition to the high cardinality but flat...
I have noticed recently that, despite nightly benchmarks kicking off at 11 PM my time (EST), they are still running when I get up at ~5 or 6 AM. This...
I heard rumors that Mark Miller is working out how to use async profiler and `perfasm` to profile Lucene/Solr! Let's make this easy in `luceneutil` too? But I don't know...
I can't believe we don't already have this, but we seem not to plot the size of the (`enwiki`) nightly Lucene index! We do for `geospatial` (NYC taxis) benchmarks ......
I ran six re-indexing benchmarks using `wikimediumall`, and each time, got a different total number of documents indexed: ``` [mike@beast3 trunk]$ grep "indexing done" /l/logs/trunk?.txt /l/logs/trunk1.txt:Indexer: indexing done (89114 msec);...