Hai

Results: 3 issues of Hai

### System Info

I'm using the "gpt-3.5-turbo-16k" model, which supports 16k tokens. However, with the map-reduce algorithm, if the answer obtained in a single step exceeds 4,000 tokens, this error is reported...
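A minimal sketch of the kind of guard this report seems to call for: counting tokens and batching intermediate map-reduce answers so that no single reduce call exceeds a configured budget. The 4,000-token limit mirrors the report; the function names and the use of tiktoken are assumptions, not the project's actual code.

```
# Hypothetical sketch: keep map-reduce intermediate answers under a token budget.
# Assumes tiktoken is installed; the 4000-token limit is taken from the report above.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo-16k")
TOKEN_BUDGET = 4000  # assumed per-call limit


def count_tokens(text: str) -> int:
    """Count tokens the same way the model tokenizer would."""
    return len(enc.encode(text))


def split_for_reduce(answers: list[str], budget: int = TOKEN_BUDGET) -> list[list[str]]:
    """Group intermediate answers into batches that each fit the budget,
    so the reduce step can run hierarchically instead of failing.
    (A single answer larger than the budget would still need its own splitting.)"""
    batches, current, used = [], [], 0
    for ans in answers:
        n = count_tokens(ans)
        if current and used + n > budget:
            batches.append(current)
            current, used = [], 0
        current.append(ans)
        used += n
    if current:
        batches.append(current)
    return batches
```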

### Describe the problem

```
@app.get("/get_collections")
async def get_collections():
    try:
        client = HttpClient(host="127.0.0.1", port=8800)
        response = client.list_collections()
        result = []
        for i in response:
            temp = {
                "id": i.id,
                "name": i.name,
                ...
```

enhancement
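For reference, a self-contained sketch of what the truncated endpoint above appears to do, assuming FastAPI and a chromadb version whose `list_collections()` returns Collection objects (newer releases may return only names). The response shape and error handling are guesses, not the reporter's actual code.

```
# Hypothetical completion of the truncated endpoint above.
# Assumes a Chroma server reachable at 127.0.0.1:8800 and a FastAPI app.
from chromadb import HttpClient
from fastapi import FastAPI, HTTPException

app = FastAPI()


@app.get("/get_collections")
async def get_collections():
    try:
        client = HttpClient(host="127.0.0.1", port=8800)
        response = client.list_collections()
        result = []
        for collection in response:
            # Collection objects expose id, name, and metadata attributes.
            result.append({
                "id": str(collection.id),
                "name": collection.name,
                "metadata": collection.metadata,
            })
        return {"collections": result}
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))
```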

### 📚 The doc issue

Today, I deployed the Qwen3 embedding model (version 0.9.1) on a V100 GPU. The model starts up without errors, but when making requests, I encounter...

documentation
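Not part of the report, but for concreteness, this is the kind of request that would typically hit the truncated error above: an OpenAI-compatible embeddings call against the locally deployed model. The base URL, port, and served model name here are assumptions.

```
# Hypothetical request against a locally served Qwen3 embedding model.
# Base URL, port, and model name are placeholders, not from the report.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.embeddings.create(
    model="Qwen3-Embedding-0.6B",  # placeholder served-model name
    input=["What is the capital of France?"],
)
print(len(response.data[0].embedding))  # dimensionality of the returned vector
```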