Quickwit consumes a lot of RAM when indexing with 1000 indexes or more

Open fmassot opened this issue 1 year ago • 4 comments

An indexer can take more than 15GB of RAM; this was observed on a k8s cluster with 1000 indexes.

Possible explanation: we have several queues in a given indexing pipeline, and each queue can hold up to X messages. For example, the doc processor queue has a capacity of 10 messages; if the source generates 1MB messages, a full queue holds 10MB per index, which adds up to 10GB across 1000 indexes once all the doc processor queues are full...
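A back-of-the-envelope check of that worst case, using only the figures quoted above (queue capacity of 10, 1MB messages, 1000 indexes); the constants are the assumptions from this explanation, not measured values:

```rust
fn main() {
    // Assumed figures from the explanation above, not measurements.
    let num_indexes: u64 = 1_000;
    let doc_processor_queue_capacity: u64 = 10; // messages per queue
    let message_size_bytes: u64 = 1_000_000; // ~1MB per message

    // Worst case: every doc processor queue is full at the same time.
    let worst_case_bytes = num_indexes * doc_processor_queue_capacity * message_size_bytes;
    println!("worst case: {} GB", worst_case_bytes / 1_000_000_000); // 10 GB
}
```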

We should first confirm this is the correct explanation.

Possible solution: define a RAM budget at the server level and use it to limit the number of messages waiting in the doc processor queue (or other actors?). The budget could be equivalent to the ingest API memory budget.
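As a rough illustration of what such a budget could look like (a sketch only, not Quickwit's actual actor API; the semaphore-based approach and all names are assumptions): reserve bytes from a node-wide budget before a batch is allowed to sit in a doc processor queue, and return them once the batch has been processed.

```rust
// Sketch only: a node-wide RAM budget shared by all indexing pipelines.
// Semaphore permits stand in for KiB of budget, so the total memory held by
// queued batches is bounded regardless of the number of indexes.
use std::sync::Arc;
use tokio::sync::Semaphore;

const BUDGET_KIB: usize = 2 * 1024 * 1024; // hypothetical 2GiB budget

async fn reserve_and_enqueue(budget: Arc<Semaphore>, batch: Vec<u8>) {
    let kib = ((batch.len() + 1023) / 1024) as u32;
    // Waits until enough budget is free: backpressure happens here instead of
    // each per-index queue growing independently.
    let permit = budget
        .acquire_many_owned(kib)
        .await
        .expect("budget semaphore closed");
    // Hand (batch, permit) to the doc processor; dropping the permit after the
    // batch is indexed gives the bytes back to the shared budget.
    drop((batch, permit));
}

#[tokio::main]
async fn main() {
    let budget = Arc::new(Semaphore::new(BUDGET_KIB));
    reserve_and_enqueue(budget, vec![0u8; 1 << 20]).await;
}
```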

fmassot avatar Feb 20 '24 16:02 fmassot

I've instrumented Quickwit to find out where allocations happen. On an instance running just the indexer service, with 1.5k indexes all getting fed lightly, I get 8.9GiB of allocated memory with the following breakdown (one generic way to gather this kind of breakdown is sketched after the list):

  • mrecordlog: 750MiB
  • quickwit_ingest: 1.0GiB inside a ServiceStream containing a tonic Streaming decoder
  • hyper::proto: 780MiB of protobuf records for MRecordBatch inside a FetchMessage
  • 420MiB of serializing things in StoreWriter, some SegmentWriter and FastFieldsWriter...
  • IndexerState::index_batch: 5.4GiB
      ◦ stacker's MemoryArena: 2.9GiB (inside Page)
      ◦ stacker's ArenaHashMap: 1.3GiB
      ◦ SingleSegmentIndexWriter::add_document: 640MiB (about half StoreWriter, then some Segment and FastField writers)
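For context, one generic way to collect a live-bytes breakdown like this is to wrap the global allocator and count allocations. The sketch below is only the counting core (attributing bytes to call sites additionally requires capturing backtraces) and is not necessarily the instrumentation that was actually used here.

```rust
// Sketch only: a counting global allocator that tracks live heap bytes.
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = System.alloc(layout);
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
        System.dealloc(ptr, layout);
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

fn main() {
    let buf = vec![0u8; 1 << 20];
    println!("live heap bytes: {}", LIVE_BYTES.load(Ordering::Relaxed));
    drop(buf);
}
```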

We could maybe reduce the size of an arena Page (currently 1MiB): this memory seems to be mostly pages allocated while creating the structure, with rarely any new allocation after a page gets filled (2.9GiB across 1.5k indexes is roughly 2MiB, i.e. a couple of pages, per index). But a smaller page could cost performance in a workload indexing more documents on fewer indexes. The same goes for ArenaHashMap's size.

Or we could start fewer indexing pipelines at once: go through all the docs in one queue, commit as soon as the queue is empty, leave that index alone for some time while working on others, and come back to it after a short while (see the sketch below).
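A minimal sketch of that scheduling idea (the names and structure are made up for illustration; this is not how Quickwit's pipelines are actually wired): drain one index's queue completely, commit, and drop its in-memory indexing state before moving on to the next index.

```rust
// Sketch only: round-robin over per-index queues instead of keeping an
// indexing pipeline (and its arenas/writers) alive for every index at once.
use std::collections::VecDeque;

struct PendingIndex {
    index_id: String,
    docs: Vec<String>, // documents queued for this index
}

fn index_and_commit(index_id: &str, docs: Vec<String>) {
    // Placeholder: build a split from `docs`, commit it, then drop all the
    // in-memory indexing structures for this index.
    println!("committed {} docs for {index_id}", docs.len());
}

fn drain_round_robin(mut pending: VecDeque<PendingIndex>) {
    while let Some(next) = pending.pop_front() {
        index_and_commit(&next.index_id, next.docs);
        // If new docs arrive for this index in the meantime, it gets pushed
        // to the back of `pending` again (omitted here).
    }
}

fn main() {
    let pending = VecDeque::from(vec![
        PendingIndex { index_id: "index-a".into(), docs: vec!["doc1".into()] },
        PendingIndex { index_id: "index-b".into(), docs: vec![] },
    ]);
    drain_round_robin(pending);
}
```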

trinity-1686a avatar Feb 22 '24 11:02 trinity-1686a

are you using cooperative indexing?

fulmicoton avatar Feb 22 '24 11:02 fulmicoton

Set enable_cooperative_indexing: true in the indexer config.

fulmicoton avatar Feb 22 '24 11:02 fulmicoton

With cooperative indexing, things are much more tame. If I increase the number of concurrent requests (I'm currently at 800), memory is dominated by buffers allocated in warp/hyper to handle the request bodies. I assume they are handed to us quickly, and we keep them for some time, passing them around.

trinity-1686a avatar Feb 22 '24 15:02 trinity-1686a