Open-Assistant
Load test different models in the inference-server
We want to load test different models within the inference server to understand how performance scales with model size, such as:
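As a rough starting point, here is a minimal load-test sketch in Python. The endpoint URL (`http://localhost:8000/generate`), the JSON payload shape, and the concurrency numbers are placeholder assumptions, not the inference server's actual API; they would need to be swapped for the real routes and request schema before use.

```python
# Minimal async load-test sketch for an inference endpoint.
# NOTE: BASE_URL and the payload shape below are hypothetical placeholders,
# not the Open-Assistant inference server's real API.
import asyncio
import statistics
import time

import aiohttp

BASE_URL = "http://localhost:8000/generate"  # hypothetical endpoint
CONCURRENCY = 8        # simultaneous in-flight requests
TOTAL_REQUESTS = 100   # requests per run


async def one_request(session: aiohttp.ClientSession) -> float:
    """Send one prompt and return its end-to-end latency in seconds."""
    payload = {"prompt": "Explain the moon landing to a six year old."}
    start = time.perf_counter()
    async with session.post(BASE_URL, json=payload) as resp:
        await resp.read()  # drain the body so timing covers the full response
    return time.perf_counter() - start


async def run() -> None:
    sem = asyncio.Semaphore(CONCURRENCY)

    async def bounded(session: aiohttp.ClientSession) -> float:
        async with sem:
            return await one_request(session)

    async with aiohttp.ClientSession() as session:
        t0 = time.perf_counter()
        latencies = await asyncio.gather(
            *(bounded(session) for _ in range(TOTAL_REQUESTS))
        )
        wall = time.perf_counter() - t0

    latencies = sorted(latencies)
    print(f"p50 latency:  {statistics.median(latencies):.2f}s")
    print(f"p95 latency:  {latencies[int(0.95 * len(latencies)) - 1]:.2f}s")
    print(f"throughput:   {TOTAL_REQUESTS / wall:.2f} req/s")


if __name__ == "__main__":
    asyncio.run(run())
```

Running the same script against each candidate model (and sweeping `CONCURRENCY`) would give comparable p50/p95 latency and throughput numbers per model size.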