Michael McCandless

216 comments by Michael McCandless

Hi, can you describe what change you are suggesting here?

Maybe the geo benchmarks? They use doc values for computing distance, sorting? I think it's also possible to turn on `SortedSetDVFacets`. But I don't think we have any large (`BINARY`?)...

Thanks @balmukundblr this looks great! Could you please open a new PR on the new Lucene GitHub repo? https://github.com/apache/lucene Thanks!

Another few, not sure if these also fail on mainline (though prolly we have seed shifting?):

```
[junit4:pickseed] Seed property 'tests.seed' already defined: B87E3065EF9405AA
[junit4] says ᐊᐃ! Master seed: B87E3065EF9405AA...
```

I now understand @rmuir's concern: because today we force sum of term freq within a single document to fit in `int` (during this `invertState.length` accumulation for norms), and because...

> > > Hmm, but I think sumTotalTermFreq, which is per field sum of all totalTermFreq across all terms in that field, could overflow long even today, in an adversarial...

Hmm, I see this [src fix was committed, but the new unit test was not committed](https://github.com/apache/lucene/commit/49631ace9f1ee110d52a207377e4926baef74929) -- was that intentional?

Whoa, thanks @uschindler -- this looks awesome. > @mikemccand can you test this with JDK 15 (release candidate) and your test. You should not see any locks anymore, speed should...

These are awesome questions about segment replication! This is indeed a challenging situation for segrep, but is solvable with one of the three proposed options. Lucene is fundamentally a write-once...