
Add distributed testing for inference server

jackapbutler opened this issue 2 years ago

Overview

We want to test the dockerised inference-server under different stress conditions, such as:

  1. Load testing - handling many concurrent users
  2. Latency testing - measuring how quickly the server responds to users

This should inform changes to the inference server, as it can help diagnose bottlenecks in the backend. It also gives us a better idea of the compute requirements for hosting a worker or inference server node under different conditions. A sketch of what such a test could look like is included below.
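As a rough illustration (not the actual test setup for this issue), a tool like Locust could cover both test types at once, since it drives many concurrent simulated users and reports response-time percentiles. The endpoint path and payload below are placeholders, not the real inference-server API:

```python
# locustfile.py -- hypothetical load-test sketch; the endpoint and payload
# are placeholders, not the actual inference-server API.
from locust import HttpUser, between, task


class InferenceUser(HttpUser):
    # Each simulated user waits 1-3 seconds between requests.
    wait_time = between(1, 3)

    @task
    def send_prompt(self):
        # Placeholder request; swap in the real chat/message endpoint.
        self.client.post(
            "/generate",
            json={"prompt": "Hello, how are you?", "max_new_tokens": 64},
        )
```

Running something like `locust -f locustfile.py --host http://localhost:8000 -u 50 -r 5` would simulate 50 concurrent users spawning at 5 per second, and Locust's report includes both throughput and latency percentiles.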

Tasks

  • [x] #1622
  • [ ] #1628
  • [ ] #1629
  • [ ] #1623
  • [ ] #1624
  • [ ] #1625
  • [ ] #1626

Context

@yk suggested I work on this. I'm a research engineer at Faculty (https://faculty.ai/), committing almost full time to OS contributions for the foreseeable future.

jackapbutler · Feb 16 '23 15:02

self-assign

jackapbutler · Feb 16 '23 15:02

Hi @jackapbutler, are you still interested in working on this? Or shall I unassign you from all the load testing issues?

olliestanley · Jun 12 '23 17:06

Hey @olliestanley, no, sorry, I won't be able to commit to this now.

jackapbutler · Jun 13 '23 06:06

No problem :)

olliestanley · Jun 13 '23 07:06