Open-Assistant
Add initial load tests
- closes #1622
I've used Locust to write a basic load test that hits two endpoints in sequence, mimicking the text-client in the inference server:
- `/chat` to start a new conversation with `chat_id`
- `/chat/{chat_id}/message` to send a message to the Assistant
An isolated load test user workflow can be summarised as follows. First, $X$ users are spawned every $T$ seconds, up to a maximum of $N$ concurrent users. Each user then does the following:
- A user starts a conversation with the Assistant
- Then they enter a conversation loop
  - Send a chat message to the Assistant
  - Wait until the Assistant responds
  - Wait $S$ further seconds
  - Repeat
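The per-user workflow above can be sketched in plain Python, independent of Locust, so it can be exercised without a running server. This is a minimal sketch under stated assumptions, not the load test in this PR: `run_user`, `Client`, and `FakeClient` are hypothetical names, the request payloads and the `"id"` response field are guesses at the API shape, and the loop is bounded by `turns` so that it terminates.

```python
import time
from typing import Any, Callable, Protocol


class Client(Protocol):
    """Hypothetical client: POSTs JSON to a path and returns the parsed JSON reply."""

    def post(self, path: str, json: dict[str, Any]) -> dict[str, Any]: ...


def run_user(
    client: Client,
    think_time_s: float,
    turns: int,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Drive one simulated user through the workflow and return its chat_id."""
    # Start a conversation. The "id" field in the response is an assumption.
    chat_id = client.post("/chat", json={})["id"]

    # Conversation loop, bounded here by `turns` (the real loop repeats indefinitely).
    for turn in range(turns):
        # The call returns only once the Assistant's reply has come back.
        client.post(f"/chat/{chat_id}/message", json={"message": f"Hello #{turn}"})
        # Wait S further seconds before the next message.
        sleep(think_time_s)
    return chat_id


class FakeClient:
    """In-memory stand-in for the inference server so the sketch runs offline."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def post(self, path: str, json: dict[str, Any]) -> dict[str, Any]:
        self.calls.append(path)
        return {"id": "abc"} if path == "/chat" else {"reply": "ok"}
```

In a Locust file the same steps would typically live in an `HttpUser` subclass: the `/chat` call in `on_start`, the message call in a `@task` method, and `wait_time = constant(S)` supplying the pause. Locust expresses its spawn rate in users per second, so spawning $X$ users every $T$ seconds corresponds roughly to `--spawn-rate` $X/T$ together with `--users` $N$.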
Initial results showed the Bot failing (potentially due to a race condition), and I'm currently investigating the source of that error. The main error happens inside `add_prompter_message()` and is caused by the `chat.pending_message_request` status being non-`None`, giving rise to an `HTTPError` of “Already pending”.
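To make the failure mode concrete, here is a toy model of the guard described above. It is not the inference server's code: the `Chat` class, the `AlreadyPending` exception (standing in for the `HTTPError`), and `complete_pending` are hypothetical. It only illustrates how a second message arriving before the pending one is cleared would produce “Already pending”. Whether that is the actual cause is still under investigation.

```python
from typing import Optional


class AlreadyPending(Exception):
    """Hypothetical stand-in for the HTTPError carrying "Already pending"."""


class Chat:
    """Toy chat state; only the field named in the error report is modelled."""

    def __init__(self) -> None:
        # None means no prompter message is currently awaiting a reply.
        self.pending_message_request: Optional[str] = None


def add_prompter_message(chat: Chat, message: str) -> None:
    # Guard as described: reject a new message while one is still pending.
    if chat.pending_message_request is not None:
        raise AlreadyPending("Already pending")
    chat.pending_message_request = message


def complete_pending(chat: Chat) -> None:
    # Hypothetical: runs once the Assistant's reply has been delivered.
    chat.pending_message_request = None
```

Under this model, a client that sends its next message before the server has cleared the pending state would hit the error even though, from the client's point of view, the previous reply has already arrived.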