Robert Muir
Here's a first stab at what I proposed on https://github.com/apache/lucene/pull/692. You can see how damaging the current cost() implementation is. As followup commits we can add the `grow(long)` sugar that...
If we want to add the `grow(long)` sugar method that simply truncates to `Integer.MAX_VALUE` and clean up all the points callsites, or write a cool `FixedBitSet.approximateCardinality`, feel free to...
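To make the approximate-cardinality idea a bit more concrete, here is a minimal sketch of one way such a method could work: sample every Nth 64-bit word and scale the popcount. This is not Lucene's `FixedBitSet`; the class name, the sampling stride, and the bookkeeping below are assumptions made purely for illustration.

```java
// Hypothetical sketch only, not Lucene's FixedBitSet.
class SampledBitSet {
  private final long[] bits; // backing words, as in a fixed bit set
  private final int numBits;

  SampledBitSet(int numBits) {
    this.numBits = numBits;
    this.bits = new long[(numBits + 63) >>> 6];
  }

  void set(int index) {
    bits[index >>> 6] |= 1L << index;
  }

  /** Exact count: popcount of every word. */
  int cardinality() {
    int count = 0;
    for (long word : bits) {
      count += Long.bitCount(word);
    }
    return count;
  }

  /** Cheap estimate: popcount of every 16th word, scaled back up. */
  int approximateCardinality() {
    final int step = 16; // sampling stride, chosen arbitrarily here
    if (bits.length < step) {
      return cardinality(); // too small to bother sampling
    }
    long sampled = 0;
    int samples = 0;
    for (int i = 0; i < bits.length; i += step) {
      sampled += Long.bitCount(bits[i]);
      samples++;
    }
    // scale the sampled popcount to the full number of words
    return (int) Math.min(numBits, sampled * bits.length / samples);
  }
}
```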
> I don't think the grow(long) is necessary; we can always add it to the IntersectVisitor instead. Maybe it would be worth adjusting how we call grow() in BKDReader#addAll as it...
There's no way we're allowing more than `Integer.MAX_VALUE` calls going into this buffers thing.
seriously, look at `threshold`. it's `maxDoc >>> 7`. `maxDoc` is an int. when you call `grow(anywhere close to Integer.MAX_VALUE)` the buffers exit the stage permanently. 64 bits are not needed.
if we add `grow(long)` that simply truncates and forwards, then it encapsulates this within this class. The code stays simple and the caller doesn't need to know about it.
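Here is a minimal sketch of the truncate-and-forward idea under the assumptions spelled out above (threshold of `maxDoc >>> 7`, buffers abandoned for a bitset once it's crossed). It is not the real `DocIdSetBuilder`; the class name, fields, and bookkeeping are stand-ins for illustration only.

```java
// Illustrative sketch, not the actual DocIdSetBuilder code.
class BuilderSketch {
  private final int threshold; // maxDoc >>> 7, as described above
  private boolean usesBitSet;  // true once we've upgraded past the buffers
  private long totalAllocated;

  BuilderSketch(int maxDoc) {
    this.threshold = maxDoc >>> 7;
  }

  // existing int entry point: once the running total crosses the threshold,
  // the buffers are abandoned for a bitset and never come back
  void grow(int numDocs) {
    totalAllocated += numDocs;
    if (totalAllocated > threshold) {
      usesBitSet = true;
    }
  }

  // the proposed sugar: truncating is harmless because anything near
  // Integer.MAX_VALUE is already far beyond any possible threshold
  // (threshold <= Integer.MAX_VALUE >>> 7, roughly 16.7M)
  void grow(long numDocs) {
    grow((int) Math.min(numDocs, Integer.MAX_VALUE));
  }
}
```

The point of the overload is that callers passing a long never need to know about the int-sized internals; the truncation is contained entirely inside the class.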
@iverase @jpountz I "undrafted" the PR and added a commit with the `grow(long)` that just truncates-n-forwards. It seems like the best compromise based on discussion above. I also made some...
For the record, this DocIdSetBuilder.Buffer has been insanely damaging to our code; I'm still here trying to calm down the explosion of horribleness it caused. I opened https://issues.apache.org/jira/browse/LUCENE-10443...
I reverted adding the `grow(long)` helper. I won't be the one bringing 64 bits into this API. It builds DocId *Sets*. It is an implementation detail that for small sets it may...
this will actually slow down merging heavily by preventing things like optimized bulk merges of stored fields. I really don't think we should be doing this with a codec wrapper...