quickwit icon indicating copy to clipboard operation
quickwit copied to clipboard

Makes the MAX_CONCURRENT_SPLIT_UPLOAD parameter configurable

Open guidao opened this issue 1 year ago • 1 comments

This parameter currently uses the same value for all uploaders. This means that the larger the num_pipeline, the more incentive there is to compete.

The "Maximum application poll interval (max.poll.interval.ms) exceeded" error we get when using it is probably related to this parameter. I tried to change this parameter to 200 and it seems to be more stable in consumption

image

guidao avatar Oct 07 '22 02:10 guidao

@evanxg852000 can you make the parameter configurable? (we do want to keep one global limit for all pipeplines)

fulmicoton avatar Oct 07 '22 05:10 fulmicoton

@fulmicoton @guilload I wanted to go further by removing the global CONCURRENT_UPLOAD_PERMITS from uploader.rs to make permits that gets passed down from IndexingService to uploaders. However, there is the JanitorService that creates uploaders via its DeleteTaskService. Options are:

  • Keep the global variable, wrapping it in OnceCell (current solution)
  • Create permits in quickwi_serve as Arc<Semaphore> and share among IndexingService and JanitorService? (permit is passed down to many intermediate before reaching target uploader)
  • Create permits for each of IndexingService and JanitorService with the assumption that JanitorService and IndexingService are not expected to run on the same node?

Is there any reason we are currently acquiring the semaphore outside of the upload tokio task? https://github.com/quickwit-oss/quickwit/blob/main/quickwit/quickwit-indexing/src/actors/uploader.rs#L232

evanxg852000 avatar Oct 10 '22 19:10 evanxg852000

@evanxg852000 I would go for the simplest solution, keep the global variable and just make it configurable as the delete service should be run on another node anyway and I think this is "ok" to share the permits in a single node deployment.

fmassot avatar Oct 10 '22 20:10 fmassot

One constraint that has just cross my mind: we should be able to start several servers in tokio tasks for testing purpose, so maybe having a config variable passed to the indexing / delete service would be better.

fmassot avatar Oct 10 '22 20:10 fmassot

Ok @fmassot, Thanks for jumping in. I will keep the global solution, the constraint can be satisfied.

evanxg852000 avatar Oct 10 '22 21:10 evanxg852000