Varun Gupta issues

Results: 20 issues of Varun Gupta

Latency increases on enabling authentication for a high number of connections

We are using Cassandra version 3.0.14 with the gocql driver. The number of open connections is 15k (which I understand is an anti-pattern). Read = 8k qps, Write = 3k qps. Latencies...

waiting-for-info

WIP: Add unit test code coverage

Should we switch to a different pod if model adapter load fails x times?

### Scenario 1: model adapter load is failing infinitely

I added an error in model load. As expected, the model is not loading. But the problem is that, in model adapter it...

kind/enhancement

help wanted

area/lora

event/bugbash

Aibrix-runtime container for mock app crash loops on liveness probe failure

### 🐛 Describe the bug

![image](https://github.com/user-attachments/assets/2ef727f7-a390-440b-8b97-ab985a964fd0)

### Steps to Reproduce

_No response_

### Expected behavior

_No response_

### Environment

_No response_

[WIP] Gateway refactoring

Add model API

Address https://github.com/aibrix/aibrix/issues/302

Use string based tokenizer in prefix cache

## Pull Request Description

Use a string-based tokenizer to replace the OpenAI tokenizer. The reason is that the latency overhead of the OpenAI tokenizer was 50 to 100 ms.

## Related Issues

Resolves: #673

**Important:...
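The idea of feeding a prefix cache from a string-based tokenizer can be sketched as follows. This is a minimal illustration only, not the AIBrix implementation: the whitespace split, the block size of 4, the chained SHA-256 hashing, and the names `string_tokenize`, `prefix_block_hashes`, and `shared_prefix_blocks` are all assumptions made for the example.

```python
import hashlib
from typing import List


def string_tokenize(text: str) -> List[str]:
    """Hypothetical string-based tokenizer: split on whitespace.

    It needs no vocabulary lookup, which is why it is cheap compared
    with a model-specific tokenizer.
    """
    return text.split()


def prefix_block_hashes(tokens: List[str], block_size: int = 4) -> List[str]:
    """Hash each full block of tokens, chaining in the previous hash.

    Because every hash depends on all earlier blocks, two prompts share
    the hash at position i only if their first i+1 blocks are identical
    (ignoring hash collisions). A trailing partial block is skipped.
    """
    hashes = []
    prev = ""
    for start in range(0, len(tokens) - block_size + 1, block_size):
        block = " ".join(tokens[start:start + block_size])
        prev = hashlib.sha256((prev + "|" + block).encode("utf-8")).hexdigest()
        hashes.append(prev)
    return hashes


def shared_prefix_blocks(a: List[str], b: List[str]) -> int:
    """Count how many leading block hashes two prompts have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count
```

A router could compare the block hashes of an incoming prompt with those recorded per pod and prefer the pod with the longest shared prefix. Whether whitespace splitting is accurate enough depends on the workload; the sketch trades tokenizer fidelity for speed.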

WIP: Ignore worker pods for gateway routing

## Pull Request Description

Ignores worker pods for gateway routing

## Related Issues

Resolves: #[Insert issue number(s)]

**Important: Before submitting, please complete the description above and review the checklist below.**...

Make stream include usage optional

## Pull Request Description

Make stream include usage an optional parameter. If the request is for a user (a user has a default TPM limit if none is configured), then stream's include usage is...

Add check for containers ready

## Pull Request Description

Along with the pod ready condition check, add a check for containers ready as well.

## Related Issues

Resolves: #781

**Important: Before submitting, please complete the description above...
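A combined readiness check of this kind might look like the sketch below. It operates on a plain dictionary shaped like the JSON form of a Kubernetes Pod object, where `status.conditions` holds entries with `type` and `status` fields; `Ready` and `ContainersReady` are standard pod condition types. The helper names `condition_is_true` and `is_pod_routable` are hypothetical and are not taken from the pull request.

```python
def condition_is_true(pod: dict, condition_type: str) -> bool:
    """Return True if the pod reports the given condition with status "True"."""
    conditions = pod.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == condition_type and c.get("status") == "True"
        for c in conditions
    )


def is_pod_routable(pod: dict) -> bool:
    """Hypothetical gate: require both Ready and ContainersReady."""
    return condition_is_true(pod, "Ready") and condition_is_true(pod, "ContainersReady")
```

A pod with a missing or empty condition list is treated as not routable, which is the conservative choice for a gateway.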