Open-Assistant
Add distributed testing for inference server
Overview
We want to test the dockerised inference server under different stress conditions, such as:
- Load testing - handling many concurrent users
- Latency testing - measuring the speed of responses to users
This should inform changes to the inference server, as it can help diagnose bottlenecks in the backend. It also gives us a better idea of the compute requirements for hosting a worker or inference server node under different conditions; a minimal sketch of such a test is given below.
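As a rough illustration of what one of these tests could look like, here is a minimal Locust sketch. This is an assumed approach, not the project's chosen tooling, and the `/chat` endpoint path and JSON payload are hypothetical placeholders rather than the real inference server API:

```python
# locustfile.py - minimal load/latency test sketch.
# Assumes the inference server is reachable over HTTP; the endpoint
# path and payload below are hypothetical placeholders.
from locust import HttpUser, task, between


class InferenceUser(HttpUser):
    # Each simulated user pauses 1-3 seconds between requests.
    wait_time = between(1, 3)

    @task
    def send_prompt(self):
        # Hypothetical route; replace with the actual inference API endpoint.
        self.client.post("/chat", json={"message": "Hello, world!"})
```

Running e.g. `locust -f locustfile.py --host http://localhost:8000 --users 100 --spawn-rate 10` would ramp up concurrent users while Locust records throughput and response-time percentiles, covering both of the conditions listed above.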
Tasks
- [x] #1622
- [ ] #1628
- [ ] #1629
- [ ] #1623
- [ ] #1624
- [ ] #1625
- [ ] #1626
Context
@yk suggested I work on this. I'm a research engineer at Faculty (https://faculty.ai/) committing almost full-time to OS contributions for the foreseeable future.
self-assign
Hi @jackapbutler, are you still interested in working on this? Or shall I unassign you from all load testing issues?
Hey @olliestanley, no sorry I won't be able to commit to this now.
No problem :)