Investigate empty segment merge

Open PSeitz opened this issue 4 years ago • 2 comments

A merge that ran after a commit selected only empty segments, and the merge thread panicked:

thread 'merge_thread1' panicked at 'Unexpected error, empty readers in IndexMerger', src/indexer/merger.rs:330:16
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'merge_thread1' panicked at 'You forgot to flush "00004481000000000000000000000000.term" before its writter got Drop. Do not rely on drop. This also occurs when the indexer crashed, so you may want to check the logs for the root cause.', src/directory/ram_directory.rs:49:13
stack backtrace:

Investigate whether checking for non-empty segments before triggering a merge is sufficient. There are several scenarios in which we can end up with an empty segment:

  • a fresh new segment ends up with no docs after processing the deletes
  • an old segment ends up with no docs after processing the deletes
  • a merged segment ends up with no docs after processing the deletes
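The check proposed above can be sketched as a filter over merge candidates. This is only an illustration under stated assumptions: `SegmentInfo`, `num_live_docs`, `partition_merge_candidates`, and `plan_merge` are hypothetical names, not tantivy's actual types or API. It assumes the number of deleted docs per segment is already known when the candidates are selected.

```rust
/// Hypothetical stand-in for per-segment metadata; NOT tantivy's actual type.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SegmentInfo {
    id: u32,
    /// Number of docs originally written to the segment.
    max_doc: u32,
    /// Number of docs removed by deletes processed so far.
    num_deleted_docs: u32,
}

impl SegmentInfo {
    /// Docs still alive once the deletes have been applied.
    fn num_live_docs(&self) -> u32 {
        self.max_doc.saturating_sub(self.num_deleted_docs)
    }
}

/// Splits candidates into (non-empty, empty) segments.
/// The empty ones could be dropped instead of being passed to a merge.
fn partition_merge_candidates(
    candidates: Vec<SegmentInfo>,
) -> (Vec<SegmentInfo>, Vec<SegmentInfo>) {
    candidates
        .into_iter()
        .partition(|segment| segment.num_live_docs() > 0)
}

/// Returns the segments worth merging, or `None` when every candidate is
/// empty, so that a merger is never built from an empty list of readers.
fn plan_merge(candidates: Vec<SegmentInfo>) -> Option<Vec<SegmentInfo>> {
    let (non_empty, _empty) = partition_merge_candidates(candidates);
    if non_empty.is_empty() {
        None
    } else {
        Some(non_empty)
    }
}
```

A filter like this covers all three scenarios only if the delete counts are up to date when the candidates are selected. If deletes can be applied between selection and the merge itself, the merger would still need to handle an empty reader list without panicking.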

To Reproduce

Change test_functional_indexing_sorted to run with 15 threads:

let mut index_writer = index.writer_with_num_threads(15, 150_000_000)?;

Maybe run in a loop or multiple times.

PSeitz avatar Oct 29 '21 09:10 PSeitz

@PSeitz I don't remember this ... Is this something you can reproduce relatively easily?

fulmicoton avatar Jul 27 '22 01:07 fulmicoton

Yes, it's easy to reproduce, but I don't get the stack trace anymore. cargo nextest seems to behave more consistently there.

NUM_FUNCTIONAL_TEST_ITERATIONS=2000000 cargo test indexing_sorted  -- --ignored

running 1 test
error: test failed, to rerun pass '--lib'

Caused by:
  process didn't exit successfully: `/home/pascal/LinuxData/Development/tantivy/gcd_encoding/target/debug/deps/tantivy-016729acb4e831a2 indexing_sorted --ignored` (signal: 6, SIGABRT: process abort signal)
NUM_FUNCTIONAL_TEST_ITERATIONS=2000000 cargo nextest run indexing_sorted --run-ignored all
   Compiling tantivy v0.18.0 (/home/pascal/LinuxData/Development/tantivy/gcd_encoding)
    Finished test [unoptimized + debuginfo] target(s) in 14.07s
  Executable unittests src/lib.rs (target/debug/deps/tantivy-ccc74d3207250cdf)
  Executable tests/failpoints/mod.rs (target/debug/deps/failpoints-4ab25e264782755f)
  Executable tests/mod.rs (target/debug/deps/mod-49b9839018f790e3)
    Starting 1 tests across 3 binaries (672 skipped)
        SLOW [> 60.000s]             tantivy functional_test::test_functional_indexing_sorted
        FAIL [  98.674s]             tantivy functional_test::test_functional_indexing_sorted

--- STDOUT:                          tantivy functional_test::test_functional_indexing_sorted ---

running 1 test
test functional_test::test_functional_indexing_sorted has been running for over 60 seconds

--- STDERR:                          tantivy functional_test::test_functional_indexing_sorted ---
thread 'merge_thread_1' panicked at 'Unexpected error, empty readers in IndexMerger', src/indexer/merger.rs:353:14
stack backtrace:


PSeitz avatar Jul 28 '22 09:07 PSeitz