Where is the batching part?
For example, if multiple clients send requests to a server, are the requests combined into a batch and forwarded together, or are they forwarded one by one?
Yes, requests are batched. If multiple clients send requests to the same model simultaneously (or in a high-throughput situation), Clipper sends the largest batch it can while still meeting the latency SLO. We avoid sending queries one by one as much as possible because we want to take advantage of the model's ability to handle a batch of inputs.
Thanks. Where is the batching part in the source code?
Batch size is determined at https://github.com/ucbrise/clipper/blob/3c5a1cc6ce59e0ccd778f526a50808d0e7b2576f/src/libclipper/src/containers.cpp#L128 using an estimation method that adaptively figures out the optimal batch size per model.
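To give a feel for what "adaptively figuring out the batch size" can look like in your own C++ project, here is a minimal sketch of one simple technique, additive-increase/multiplicative-decrease (AIMD). This is an illustration only: the class name AimdBatchSizer, its parameters, and the AIMD policy itself are assumptions of this sketch, and the estimator in the linked containers.cpp may work differently.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical AIMD batch-size controller (an illustrative assumption,
// not Clipper's actual estimator).
class AimdBatchSizer {
 public:
  // latency_slo_ms: latency budget for one batch.
  // additive_step:  how much to grow the batch after a batch that met the SLO.
  // backoff:        factor in (0, 1) applied after a batch that missed the SLO.
  explicit AimdBatchSizer(double latency_slo_ms,
                          std::size_t additive_step = 1,
                          double backoff = 0.9)
      : latency_slo_ms_(latency_slo_ms),
        additive_step_(additive_step),
        backoff_(backoff) {}

  // Batch size to use for the next batch.
  std::size_t batch_size() const { return batch_size_; }

  // Report the measured latency of the batch that was just processed.
  void record(double batch_latency_ms) {
    if (batch_latency_ms <= latency_slo_ms_) {
      // SLO met: probe for a larger batch.
      batch_size_ += additive_step_;
    } else {
      // SLO missed: shrink multiplicatively, but never below one query.
      auto reduced = static_cast<std::size_t>(batch_size_ * backoff_);
      batch_size_ = std::max<std::size_t>(1, reduced);
    }
  }

 private:
  double latency_slo_ms_;
  std::size_t additive_step_;
  double backoff_;
  std::size_t batch_size_ = 1;  // start small and grow
};
```

The caller would ask batch_size() before pulling queries off the model's queue, time the prediction call, and pass the measured latency to record(). The backoff is deliberately gentle so that a single slow batch does not collapse the batch size.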
Batching happens inside the query frontend, which is the C++ code in the src/ directory. Each model has a queue that aggregates queries from (potentially) different sources.
Thanks for your reply. By the way, how can I debug the source code while it handles a request?
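If it helps to see the per-model queue idea in isolation, here is a minimal sketch, assuming a plain mutex-protected deque. The names Query, ModelQueue, add, and get_batch are hypothetical and chosen for this example; Clipper's real queue types and signatures differ.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical query type for this sketch.
struct Query {
  std::string input;
  std::chrono::steady_clock::time_point deadline;
};

// One queue per model: many producer threads call add(), and a single
// consumer thread drains batches with get_batch().
class ModelQueue {
 public:
  void add(Query q) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push_back(std::move(q));
    }
    not_empty_.notify_one();
  }

  // Blocks until at least one query is queued, then returns up to
  // max_batch_size queries in arrival order.
  std::vector<Query> get_batch(std::size_t max_batch_size) {
    std::unique_lock<std::mutex> lock(mutex_);
    not_empty_.wait(lock, [this] { return !queue_.empty(); });
    std::vector<Query> batch;
    while (!queue_.empty() && batch.size() < max_batch_size) {
      batch.push_back(std::move(queue_.front()));
      queue_.pop_front();
    }
    return batch;
  }

  // Number of queries currently waiting.
  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return queue_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::condition_variable not_empty_;
  std::deque<Query> queue_;
};
```

Because get_batch takes whatever has accumulated (up to the cap) instead of waiting for a full batch, a lightly loaded model still answers quickly, while a busy one naturally gets larger batches.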
Can you clarify? Which part of Clipper do you want to debug? Your model's logs can be checked via docker logs or kubectl logs.
I'm asking because Clipper has a distributed architecture, and distributed systems are usually hard to debug as a whole.
I want to step through the batching part in a debugger to understand how it works, since I would like to use it in my own C++ project. Is there an easy way to do this?
One way to do it is to modify the Dockerfile for the query frontend. Instead of
https://github.com/ucbrise/clipper/blob/3c5a1cc6ce59e0ccd778f526a50808d0e7b2576f/dockerfiles/QueryFrontendDockerfile#L17
you can write:
RUN apt-get update && apt-get install -y gdb
ENTRYPOINT ["gdb", "/clipper/release/src/frontends/query_frontend"]
Then build the Docker image by running the following inside $CLIPPER/dockerfiles:
docker build -t clipper/query-frontend:0.3.0 -f QueryFrontendDockerfile ..
Then run your normal Clipper pipeline. After sending a request, you can attach a gdb session inside the Docker container (you can find the container ID with docker ps):
docker exec -it <Query Frontend Container ID> /bin/bash
and once inside,
top # to find the PID for query frontend process
gdb -p $PID # fill in the PID for query frontend process, I believe it should be 0 or 1 but it's worth checking
Caveat: the query frontend code is multi-threaded, and its threads communicate via concurrent queues, so it will probably be hard to debug. Reading up on debugging multi-threaded programs with gdb should help (for example, info threads lists the threads and thread apply all bt prints a backtrace for each one).
Thanks for your warm reply. I will try it. I appreciate your wonderful work.