
[Feature Request] Add message queue between model clients and vllm or ollama servers

Open lightaime opened this issue 1 year ago • 7 comments

Required prerequisites

  • [X] I have searched the Issue Tracker and Discussions to confirm this hasn't already been reported. (+1 or comment there if it has.)
  • [ ] Consider asking first in a Discussion.

Motivation

For load balancing between multiple model clients and vllm or ollama servers:

  • [ ] Add Kafka: https://kafka.apache.org/08/documentation.html

The message queue abstraction can also be used for workforce or task assignment. We should take this into consideration.
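
Since the design is still open, here is a minimal in-process sketch of the load-balancing idea: a shared queue feeding one worker per backend server. A broker such as Kafka or RabbitMQ would replace `queue.Queue` in a real deployment, but the pull-based shape is the same. The URLs, endpoint path, and model name are illustrative assumptions, not part of any agreed design:

```python
# Illustrative sketch only -- not an agreed CAMEL design.
import queue
import threading

import requests  # assumed available; any HTTP client works

# Hypothetical server URLs; vLLM exposes an OpenAI-compatible
# /v1/completions endpoint (Ollama uses its own /api/generate).
MODEL_SERVER_URLS = [
    "http://localhost:8000/v1/completions",
    "http://localhost:8001/v1/completions",
]

request_queue: "queue.Queue[dict]" = queue.Queue()


def worker(server_url: str) -> None:
    # Pull-based balancing: each server takes a new request only when
    # it has finished the previous one, so faster servers do more work.
    while True:
        payload = request_queue.get()
        try:
            resp = requests.post(server_url, json=payload, timeout=60)
            print(server_url, "->", resp.status_code)
        except Exception as exc:
            print(server_url, "failed:", exc)
        finally:
            request_queue.task_done()


for url in MODEL_SERVER_URLS:
    threading.Thread(target=worker, args=(url,), daemon=True).start()

# Clients only enqueue; they never choose a server themselves.
for i in range(8):
    request_queue.put({"model": "my-model", "prompt": f"request {i}"})
request_queue.join()
```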

Solution

No response

Alternatives

No response

Additional context

No response

lightaime avatar Jul 19 '24 13:07 lightaime

Perhaps RabbitMQ is also a good choice?

Asher-hss avatar Jul 19 '24 14:07 Asher-hss

From Guohao: we can use a message queue for message exchange and load balancing in CAMEL. The design may differ a little depending on the use case.

Wendong-Fan avatar Aug 12 '24 13:08 Wendong-Fan

Possible queue placements: Agent - Agent, Task - Agent, Agent - Model.

The queue is independent and will not be exposed to the user.

Wendong-Fan avatar Aug 12 '24 13:08 Wendong-Fan

A good design may be:

`AgentModelLoaderBalancer(model_server_urls: List[str]) -> loader_balancer_url: str`
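
To make that signature concrete, here is a hedged sketch (not CAMEL code): it spins up a local round-robin HTTP proxy over the given backends and returns the proxy's URL, so callers only ever see a single endpoint. The function name and signature follow the comment above; the proxy internals, ports, and use of `requests` are illustrative assumptions:

```python
# Illustrative sketch only -- not CAMEL's implementation.
import itertools
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import List

import requests  # assumed available; any HTTP client works


def AgentModelLoaderBalancer(model_server_urls: List[str]) -> str:
    """Start a local proxy that round-robins POSTs across the backends."""
    backends = itertools.cycle(model_server_urls)
    lock = threading.Lock()

    class _Proxy(BaseHTTPRequestHandler):
        def do_POST(self):
            with lock:  # itertools.cycle is not thread-safe by itself
                target = next(backends)
            length = int(self.headers.get("Content-Length", "0"))
            body = self.rfile.read(length)
            resp = requests.post(
                target + self.path,
                data=body,
                headers={"Content-Type": "application/json"},
                timeout=60,
            )
            self.send_response(resp.status_code)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(resp.content)

    server = ThreadingHTTPServer(("127.0.0.1", 0), _Proxy)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    return f"http://{host}:{port}"  # the loader_balancer_url


# Usage: point any client at the single returned URL.
balancer_url = AgentModelLoaderBalancer(
    ["http://localhost:8000", "http://localhost:8001"]  # hypothetical servers
)
print(balancer_url)
```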

lightaime avatar Aug 12 '24 13:08 lightaime

We can make it an independent package.

Wendong-Fan avatar Aug 12 '24 13:08 Wendong-Fan

Could you please provide more context about the feature and its requirements? I have the following questions:

  1. What is the size of the messages (average, maximum, or an estimate)?
  2. What kind of message ordering is required? For example, total ordering across agents, per-agent ordering, or is ordering not a concern?
  3. What level of message delivery guarantee is needed (e.g., at least once, at most once, exactly once)?

Additionally, could someone please explain the abstraction of 'Task' to me? I would really appreciate it.

Thanks!

sfc-gh-yihuang2 avatar Aug 15 '24 16:08 sfc-gh-yihuang2

@sfc-gh-yihuang2 Hello, thank you for your interest in this issue. Since this feature is still under research and design, some information may not be accurate:

  1. The number of messages will be far more than the limited set of LLM model servers can "evenly" handle (it will be on a much larger scale).
  2. Message ordering is based on priority (message priority or agent priority) plus some other optimization algorithms; see the sketch at the end of this comment.
  3. We prefer "exactly once," but at a minimum we should guarantee "at least once."

A Task is simply a unit of work that needs to be solved by the agents.
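
To make point 2 above concrete, here is a minimal sketch of priority-based ordering using the standard library's `PriorityQueue`. A real deployment would map this onto broker features (e.g., RabbitMQ priority queues), and the class and field names here are illustrative assumptions:

```python
# Minimal sketch of priority-based message ordering (illustrative only).
import queue
from dataclasses import dataclass, field
from typing import Any


@dataclass(order=True)
class PrioritizedMessage:
    priority: int                        # lower value = dispatched first
    payload: Any = field(compare=False)  # excluded from ordering


pq: "queue.PriorityQueue[PrioritizedMessage]" = queue.PriorityQueue()
pq.put(PrioritizedMessage(2, "background summarization task"))
pq.put(PrioritizedMessage(0, "user-facing chat request"))
pq.put(PrioritizedMessage(1, "tool call for an active agent"))

while not pq.empty():
    msg = pq.get()
    print(msg.priority, msg.payload)  # dispatched in order 0, 1, 2
```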

Appointat avatar Aug 16 '24 09:08 Appointat