Priority handling in the request queue
I've just jotted down an idea on how to handle priorities in the request queue.
I humbly think my suggestion is trivial to implement as a comparison operator between requests, and serves the purpose well. We may want to introduce a patience scaling factor to transform intuitive priorities (e.g., -20 (now!) to +20 (whenever)) into actual patience levels. See my proposal for details. Comments are welcome!
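As one purely illustrative example of such a scaling factor (the function name and the exponential shape are my assumptions, not part of the proposal):

```cpp
// Purely illustrative: squash a nice value in [-20, +20] into a positive
// patience level. The name and the exponential shape are assumptions.
#include <algorithm>
#include <cmath>

float patienceFromNice(int nice) {
  nice = std::max(-20, std::min(20, nice));       // clamp user input
  return std::exp(static_cast<float>(nice) / 5);  // -20 -> ~0.02, +20 -> ~55
}
```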
@jerinphilip cc: @kpu
This space looks better for discussions. I'm preparing the code to work on this at the moment.
So you know, the current structure holding prioritized sentences is `Batcher::bucket_`. It's a vector of `std::set` (a set, because priorities can be amended), and I'm currently overriding the `<` operator on a `RequestSentence`, which assigns it a priority. The next few changes will place guards on `Batcher::bucket_` and will move adding a request into threads, making concurrent calls to it safe.
Priority is defined at the `Request` level, and `RequestSentence` order is used only to break ties within the same `Request`.
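A minimal sketch of that ordering, assuming each `RequestSentence` can see its owning `Request`'s priority and its own index; the member names here are illustrative, not the actual bergamot-translator fields:

```cpp
// Minimal sketch of the ordering described above. Member names are
// illustrative, not the actual fields.
#include <cstddef>

struct RequestSentence {
  std::size_t requestPriority_;  // priority of the owning Request
  std::size_t requestId_;        // Request::Id_, a stable cross-Request tie-break
  std::size_t index_;            // position of this sentence within its Request

  bool operator<(const RequestSentence &other) const {
    // Request-level priority dominates; sentence index only breaks ties
    // inside the same Request.
    if (requestPriority_ != other.requestPriority_)
      return requestPriority_ < other.requestPriority_;
    if (requestId_ != other.requestId_)
      return requestId_ < other.requestId_;
    return index_ < other.index_;
  }
};
```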
`Request` has a sequence number (`Request::Id_`). It can overflow, but if a server runs long enough for that to happen, I'll probably be in good spirits by then and will happily add the reset code, much like YouTube switched to a 64-bit int when PSY broke the view counter. `Id_` might find further use in a `MultiFactorPriority`, which could have fields corresponding to the sequence number, a user-set nice value, etc. There's also an idea of using completion rates (how many sentences in a request remain) to hasten almost-finished requests; see the sketch after the list below. In any case:
1. Only the nice values will be user-configurable, through a request handle provided on queuing. This is amending priority. I expect I'll have to find the respective sentences in `Batcher::bucket_`, gain the lock, delete the existing sentences, and insert the new ones. This should update priority in O(N log N), where N is `Request.size()`.
2. A cancel will delete the request and forward the deletion to the `Batcher`'s structures (find and erase in `Batcher::bucket_`).
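For concreteness, a sketch of what a `MultiFactorPriority` might hold; the field names and the exact ordering below are my assumptions, and only the ingredients (nice value, completion rate, sequence number) come from the discussion above:

```cpp
// Hypothetical MultiFactorPriority: lower compares "more urgent".
#include <cstddef>
#include <tuple>

struct MultiFactorPriority {
  int nice;                 // user-set, amendable through the request handle
  float remainingFraction;  // sentences left / total; lower = closer to exit
  std::size_t sequenceId;   // Request::Id_; earlier requests win remaining ties

  bool operator<(const MultiFactorPriority &other) const {
    // Lexicographic: nice first, then hasten almost-finished requests,
    // then fall back to arrival order.
    return std::tie(nice, remainingFraction, sequenceId) <
           std::tie(other.nice, other.remainingFraction, other.sequenceId);
  }
};
```

Amending `nice` would then be the erase-and-reinsert dance of item 1: each `std::set` erase/insert is O(log N), hence O(N log N) over a request's sentences.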
The algorithm to cleave batches is currently greedy (optimizing for packing efficiency). With `MultiFactorPriority` in, and length as an extra dimension, this becomes a scheduling problem, and I think I can find efficient, well-studied solutions for it online, perhaps through a reduction to something like job-shop scheduling. I'm not sure why `bucket_` is in place now; maybe it will make this easier. Both you and @kpu suggested a bucket, and it just went in back then.
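To make "greedy, optimizing for packing efficiency" concrete, here is one possible reading of it; `cleaveGreedy`, the token-budget rule, and the `Sentence` type are all assumptions for illustration, not the actual `Batcher` code:

```cpp
// One reading of greedy cleaving: walk sentences in priority order and
// close a batch when the next sentence would exceed a token budget.
#include <cstddef>
#include <utility>
#include <vector>

struct Sentence {
  std::size_t numTokens;
};

std::vector<std::vector<Sentence>> cleaveGreedy(const std::vector<Sentence> &sorted,
                                                std::size_t maxTokensPerBatch) {
  std::vector<std::vector<Sentence>> batches;
  std::vector<Sentence> current;
  std::size_t used = 0;
  for (const Sentence &s : sorted) {
    if (!current.empty() && used + s.numTokens > maxTokensPerBatch) {
      batches.push_back(std::move(current));  // close the full batch
      current.clear();
      used = 0;
    }
    current.push_back(s);
    used += s.numTokens;
  }
  if (!current.empty()) batches.push_back(std::move(current));
  return batches;
}
```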
@ugermann: Have a look at the source; you'll find elements of what you're discussing already present there in some capacity. `nice` should correspond to your patience. Sequence-order tie-breaking already exists. In addition, there may be an inverse priority (i.e., one based on what fraction of the request is already completed).
Implementation Roadmap
- [ ] Prepare concurrent queuing capability at `Batcher`.
- [ ] Test scheduling optimization with sequence number and length, possibly using `MultiFactorPriority` and keeping future extensions in mind.
- [ ] Add methods to amend priority (this will add `nice` to `MultiFactorPriority`) and to cancel a `Request`.
Beyond this, the following have been discussed as desirable:
- [ ] Introduce capacity as a count of `Words`, to block at `Batcher` and return queuing failures. This should prevent resources from being overrun when many `Requests` are made.
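A hedged sketch of what that capacity guard could look like; `tryEnqueue`, `capacityWords_`, and the fail-fast policy are illustrative assumptions, not existing API:

```cpp
// Illustrative word-count capacity guard: enqueuing fails fast instead
// of overrunning resources.
#include <cstddef>
#include <mutex>

class BoundedBatcher {
 public:
  explicit BoundedBatcher(std::size_t capacityWords) : capacityWords_(capacityWords) {}

  // Returns false (a queuing failure) when admitting `words` more words
  // would exceed the configured capacity.
  bool tryEnqueue(std::size_t words) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (queuedWords_ + words > capacityWords_) return false;
    queuedWords_ += words;
    // ... add the Request's sentences to the queue here ...
    return true;
  }

  // Called as sentences complete, freeing capacity for new requests.
  void release(std::size_t words) {
    std::lock_guard<std::mutex> guard(mutex_);
    queuedWords_ -= words;
  }

 private:
  std::mutex mutex_;
  std::size_t capacityWords_;
  std::size_t queuedWords_ = 0;
};
```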