quickwit Documents lost after ingestion

Describe the bug

After ingesting 1.8T of data, we lost 9 documents. We log every response from Quickwit during ingestion, and the logs show:

{"num_docs_for_processing":7000} #247545 times
{"num_docs_for_processing":1000} #2 times
{"num_docs_for_processing":71} #1 time

So we expected 1732817071 docs as a result, but got 1732817062:

# curl -H "Content-type: application/json" -X POST \
> http://localhost:7280/api/v1/taxi/search/ \
> -d '{"query":"*","max_hits":0,"aggs":{"count(*)":{"value_count":{"field":"id"}}}}'
{ 
  "num_hits": 1732817062,
  "hits": [],
  "elapsed_time_micros": 1679406,
  "errors": [],
  "aggregations": {
    "count(*)": {
      "value": 1732817062.0
    }
  }
}
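The expected total follows directly from the logged responses. A quick check of the arithmetic, using only the figures quoted above (the variable names are just for illustration):

```python
# Batch size -> number of ingest responses that reported that size (from the log above).
response_counts = {7000: 247545, 1000: 2, 71: 1}

# Total number of documents Quickwit acknowledged for processing.
acknowledged = sum(size * times for size, times in response_counts.items())

# num_hits reported by the search request above.
indexed = 1732817062

missing = acknowledged - indexed
print(f"acknowledged={acknowledged} indexed={indexed} missing={missing}")
```

This prints acknowledged=1732817071 indexed=1732817062 missing=9, which matches the 9 lost documents.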

Steps to reproduce (if applicable)

This is a large amount of data, so we can't easily provide a dump.

You can reproduce this issue with the database comparison tool.

Clone the comparison tool

git clone git@github.com:db-benchmarks/db-benchmarks.git
cd db-benchmarks
git checkout feat/quickwit

Copy .env.example to .env
Update cpuset in .env to match the CPUs available on your machine
Open the test folder

cd tests/taxi

Add exit 1 to prevent the other engines from initializing (this doesn't affect the issue and saves space)
Run ./init

Ingestion takes 3-4 days, after which you will see the problem.

Expected behavior

The search should report a count of 1732817071 docs.

Configuration: Please provide:

Output of quickwit --version: 0.8.1
The index_config.yaml

Sep 30 '24 15:09 KlimTodrik

If documents don't match the schema, they won't be indexed, which may cause the mismatch

Oct 01 '24 01:10 PSeitz

> If documents don't match the schema, they won't be indexed, which may cause the mismatch

Should it respond with an error in that case? We don't see any error responses.

Oct 01 '24 21:10 KlimTodrik

No, I think it only logs errors currently

Oct 02 '24 00:10 PSeitz

> No, I think it only logs errors currently

There are 1,732,817,071 documents, so analyzing all logs to find the error is quite complex. I think it would be much better to notify the user directly when something goes wrong, either via the response (not just a 200 status) or by providing a dedicated errors endpoint

Jan 10 '25 15:01 KlimTodrik

The link to your index config doesn't work, but maybe the retention policy kicked in? If not specified there may be a default one, not sure.

Mar 14 '25 18:03 mrcnski
