luceneutil

Make ID assignment deterministic

Open zhaih opened this issue 2 years ago • 1 comment

Changes

  1. Changes the script that builds the binary LFDs so that it adds an ID to each line
  2. Changes the way IDs are assigned in LineFileDocs.java

zhaih, Jul 25 '22 01:07

Aha! Now I see #186! EventuallyConsistentMikeException.

You're right -- using AtomicInteger means the id assignment is non-deterministic!

But couldn't we still make it deterministic, in the binary case, by knowing the idBase of each block, and then each thread indexes that block by incrementing its id locally?
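A minimal sketch of the per-block idea above, not the actual LineFileDocs.java code: `BlockIdSketch`, `Block`, `idBase` and `docCount` are hypothetical names, and it assumes each binary block records the id of its first document. Each task increments a purely local counter, so the ids do not depend on which thread picks up which block or in what order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BlockIdSketch {
  /** Hypothetical stand-in for one binary block: its first id and how many docs it holds. */
  static final class Block {
    final int idBase;
    final int docCount;

    Block(int idBase, int docCount) {
      this.idBase = idBase;
      this.docCount = docCount;
    }
  }

  /** Assigns ids for one block using only a local counter; no state is shared across threads. */
  static int[] assignIds(Block block) {
    int[] ids = new int[block.docCount];
    int id = block.idBase;
    for (int i = 0; i < ids.length; i++) {
      ids[i] = id++;
    }
    return ids;
  }

  /** Processes all blocks on a thread pool; results are returned in block order. */
  static List<int[]> indexAll(List<Block> blocks, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<int[]>> futures = new ArrayList<>();
      for (Block b : blocks) {
        futures.add(pool.submit(() -> assignIds(b)));
      }
      List<int[]> result = new ArrayList<>();
      for (Future<int[]> f : futures) {
        result.add(f.get());
      }
      return result;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

The ids for a given block are a function of `idBase` alone, so running with 1 thread or 8 should give the same assignment.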

And in the text case, I think we hold a lock while reading the file, so we could do a simple id++ there. That could be on a volatile int, though since every thread that does the increment holds the same lock, we may not need the volatile.
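A rough sketch of the text-case suggestion above, again with made-up names (`LockedLineSource`, `Line`) rather than the real LineFileDocs.java internals. It assumes the line read and the increment happen under the same monitor, which is why a plain `int` is enough: the lock already provides the visibility that `volatile` would.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

final class LockedLineSource {
  /** Hypothetical pairing of a line with the id it was assigned. */
  static final class Line {
    final int id;
    final String text;

    Line(int id, String text) {
      this.id = id;
      this.text = text;
    }
  }

  private final BufferedReader reader;
  // Guarded by "this". Not volatile: every read and write happens inside next().
  private int nextId;

  LockedLineSource(Reader in) {
    this.reader = new BufferedReader(in);
  }

  /** Reads the next line and assigns its id in one critical section; returns null at end of input. */
  synchronized Line next() {
    try {
      String text = reader.readLine();
      if (text == null) {
        return null;
      }
      return new Line(nextId++, text);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Because the id is taken in the same critical section as the read, the n-th line of the file always gets id n-1, whichever thread happens to read it.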

mikemccand, Jul 25 '22 09:07

Thanks @zhaih!

Do we think this might mean we can turn back on the rearrange step and use multiple threads to build the "for deterministic searching" index?

Yes, I hope this is enough. I'll try to refresh my memory of the setup, run the experiment locally over the weekend, and let you know what happens!

zhaih, Aug 04 '22 22:08