Add SelfConsciousReplies component for context-aware message response ignoring
This PR adds a SelfConsciousReplies component to the DefaultComponents of an Agent.
The SelfConsciousReplies component is a PerceptionModifier that gives the agent the ability to choose not to reply to an incoming "say" perception, based on the following parameters (a rough sketch of the inputs follows the list below):
- Conversation chat history context
- Conversation Members
- The previous 6 messages, to check whether the agent has been going back and forth (helps reduce the agent's interest in the conversation)
- Agent's personality (bio)
- A set of guidelines for the Agent to consider given the context above
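
For context, here is a minimal sketch of how such a decision prompt could be assembled from the parameters above. All names are illustrative assumptions, not the actual component or SDK code:

```ts
// A minimal sketch of the inputs the modifier could assemble before asking a
// model whether the agent should reply. Interface and function names are
// illustrative, not the actual component API.
interface ReplyDecisionContext {
  chatHistory: string;       // conversation chat history context
  members: string[];         // conversation members
  lastSixMessages: string[]; // previous 6 messages (back-and-forth check)
  agentBio: string;          // agent's personality (bio)
  guidelines: string[];      // guidelines to consider given the context above
}

function buildDecisionPrompt(ctx: ReplyDecisionContext, incomingMessage: string): string {
  return [
    `Bio: ${ctx.agentBio}`,
    `Conversation members: ${ctx.members.join(', ')}`,
    `Chat history:\n${ctx.chatHistory}`,
    `Last 6 messages:\n${ctx.lastSixMessages.join('\n')}`,
    `Guidelines:\n- ${ctx.guidelines.join('\n- ')}`,
    `Incoming message: ${incomingMessage}`,
    'Decide whether the agent should reply. Respond as JSON: { "shouldRespond": boolean, "reason": string, "confidence": number }.',
  ].join('\n\n');
}
```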
The thinking process above currently returns a response with the following schema (a minimal type sketch follows the list):
- shouldRespond: boolean
- reason: string (kept to help debug future cases, to see what the thought process was and how the prompt can be improved later on)
- confidence: number (the Agent's overall confidence that it should send a reply to this message)
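
The schema could be expressed as a plain TypeScript type along these lines (the 0-1 confidence range is an assumption):

```ts
// Shape of the structured decision described above; the exact typing used by
// the component may differ.
interface ReplyDecision {
  shouldRespond: boolean; // whether the agent should reply to this message
  reason: string;         // the model's reasoning, kept for debugging/prompt tuning
  confidence: number;     // overall confidence that a reply is warranted (assumed 0-1)
}
```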
If the agent has been going back and forth in the conversation a lot, a backAndForthPenalty is applied to the agent's reply reasoning to lower its overall confidence for replying to this message (an illustrative sketch follows).
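
As an illustration only, the penalty could be applied roughly as follows (the threshold, penalty value, and back-and-forth cutoff are assumptions, not the component's actual numbers):

```ts
// Illustrative sketch: lower the confidence when the agent has been going back
// and forth a lot, and gate shouldRespond on the reduced confidence.
const BACK_AND_FORTH_PENALTY = 0.2; // assumed value
const RESPOND_THRESHOLD = 0.5;      // assumed value

function applyBackAndForthPenalty(
  decision: ReplyDecision,
  backAndForthCount: number,
): ReplyDecision {
  if (backAndForthCount < 3) return decision; // only penalize long exchanges (assumed cutoff)
  const confidence = Math.max(0, decision.confidence - BACK_AND_FORTH_PENALTY);
  return {
    ...decision,
    confidence,
    shouldRespond: decision.shouldRespond && confidence >= RESPOND_THRESHOLD,
  };
}
```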
Additional thoughts/questions:
- How much latency does gpt-4o-mini add to this pipeline?
- Can we run an OODA loop in the main action pipeline to one-shot this action decision to eliminate the latency?
- Can we make the OODA loop asynchronous from the action process to reduce latency?
Answers (in order):
- The average latency with gpt-4o-mini is ~3 seconds.
- Running the decision in the main action pipeline has been implemented in https://github.com/UpstreetAI/upstreet-core/pull/699, which removes the latency but keeps the full cost of the main action pipeline inference. With this PR's approach, that cost is avoided because the perception is aborted before the action pipeline inference is triggered, saving around ~26 upstreet credits for users.
- Making the OODA loop asynchronous would be possible; the only downside is paying for two inferences per chat message (~0.26 + ~26-30 upstreet credits).
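
For illustration, the cost-saving path described above might look roughly like this (the helper names are hypothetical placeholders, not the actual pipeline code):

```ts
// Hypothetical stand-ins for the real calls; illustrative only.
declare function decideShouldRespond(incomingMessage: string): Promise<ReplyDecision>;
declare function runMainActionPipeline(incomingMessage: string): Promise<void>;

async function onSayPerception(incomingMessage: string): Promise<void> {
  const decision = await decideShouldRespond(incomingMessage); // cheap gpt-4o-mini call (~3s average)
  if (!decision.shouldRespond) {
    return; // perception aborted here, so the main action pipeline inference (~26 credits) never runs
  }
  await runMainActionPipeline(incomingMessage);
}
```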