Feat: Venice audio (Text to Speech)

Open yornfifty opened this issue 10 months ago • 3 comments

Description

Text to speech with Venice AI, with support for multiple voice models:

af_alloy, af_aoede, af_bella, af_heart, af_jadzia, af_jessica, af_kore, af_nicole, af_nova, af_river, af_sarah, af_sky, am_adam, am_echo, am_eric, am_fenrir, am_liam, am_michael, am_onyx, am_puck, am_santa, bf_alice, bf_emma, bf_lily, bm_daniel, bm_fable, bm_george, bm_lewis, zf_xiaobei, zf_xiaoni, zf_xiaoxiao, zf_xiaoyi, zm_yunjian, zm_yunxi, zm_yunxia, zm_yunyang, ff_siwis, hf_alpha, hf_beta, hm_omega, hm_psi, if_sara, im_nicola, jf_alpha, jf_gongitsune, jf_nezumi, jf_tebukuro, jm_kumo, pf_dora, pm_alex, pm_santa, ef_dora, em_alex, em_santa
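
Each voice ID above is a two-letter prefix followed by a name. A minimal sketch of checking a requested voice against this list and grouping the IDs by prefix follows. The helpers `is_supported_voice` and `group_by_prefix` are hypothetical and not part of the PR, and the issue does not state what the prefixes encode, so the sketch treats them as opaque labels.

```python
from collections import defaultdict
from typing import Dict, List

# Voice IDs copied verbatim from the list in the PR description.
_VOICE_LIST = (
    "af_alloy, af_aoede, af_bella, af_heart, af_jadzia, af_jessica, af_kore, "
    "af_nicole, af_nova, af_river, af_sarah, af_sky, am_adam, am_echo, am_eric, "
    "am_fenrir, am_liam, am_michael, am_onyx, am_puck, am_santa, bf_alice, "
    "bf_emma, bf_lily, bm_daniel, bm_fable, bm_george, bm_lewis, zf_xiaobei, "
    "zf_xiaoni, zf_xiaoxiao, zf_xiaoyi, zm_yunjian, zm_yunxi, zm_yunxia, "
    "zm_yunyang, ff_siwis, hf_alpha, hf_beta, hm_omega, hm_psi, if_sara, "
    "im_nicola, jf_alpha, jf_gongitsune, jf_nezumi, jf_tebukuro, jm_kumo, "
    "pf_dora, pm_alex, pm_santa, ef_dora, em_alex, em_santa"
)
VOICES: List[str] = _VOICE_LIST.split(", ")


def is_supported_voice(voice: str) -> bool:
    """Hypothetical helper: True if the voice ID appears in the list above."""
    return voice in VOICES


def group_by_prefix(voices: List[str]) -> Dict[str, List[str]]:
    """Hypothetical helper: group voice IDs by the text before the first underscore."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for voice in voices:
        prefix, _, _name = voice.partition("_")
        groups[prefix].append(voice)
    return dict(groups)
```

A check like this could reject an unknown voice before any request is sent, but whether the skill validates voices this way is not stated in the PR.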

Utils.s3 update

store_file_bytes can now be used to store the most common file types, with an optional file size limit:

from enum import Enum
from typing import Optional


class FileType(str, Enum):
    IMAGE = "image"
    VIDEO = "video"
    AUDIO = "audio"
    PDF = "pdf"


async def store_file_bytes(
    file_bytes: bytes,
    key: str,
    file_type: FileType,
    size_limit_bytes: Optional[int] = None,
) -> str:
    ...

Type of Change

  • [ ] Bugfix
  • [x] New Feature
  • [x] Improvement
  • [ ] Documentation Update

Checklist

  • [x] I have read the contributing guidelines.
  • [ ] I have added tests to cover my changes.
  • [x] All new and existing tests passed.

Note

I have not added this to the properties in models/agent_schema.json yet, to avoid conflicts with other PRs.

Let me know if I should add it right away:

        "venice_audio": {
          "title" : "Venice Audio",
          "$ref": "../skills/venice_audio/schema.json"
        },

Related Issue

  • https://github.com/crestalnetwork/intentkit/issues/503
  • https://github.com/crestalnetwork/intentkit/issues/314

Showcase

I made a simple chat app that interacts with the agent to showcase the functionality: https://crestal.s3.ap-southeast-1.amazonaws.com/local/intentkit/2025-04-27%2003-40-17.mp4

yornfifty, Apr 26 '25 21:04

Hi @yornfifty, great work here!

I see two minor issues. This PR can be merged as soon as they are fixed, thanks!

  1. Config class incorrectly uses TypedDict instead of inheriting from SkillConfig:
# Current problematic code:
class Config(TypedDict):
    enabled: bool
    api_key: str
    states: SkillStates

# Should be changed to:
class Config(SkillConfig):
    """Configuration for Venice Audio skills."""
    enabled: bool
    api_key: str
    states: SkillStates
  2. Missing proper base.py implementation:
# Missing required base.py file with proper inheritance pattern:
from typing import Type
from pydantic import BaseModel, Field
from abstracts.skill import SkillStoreABC
from skills.base import IntentKitSkill

class VeniceAudioBaseTool(IntentKitSkill):
    """Base class for Venice Audio tools."""
    
    name: str = Field(description="The name of the tool")
    description: str = Field(description="A description of what the tool does")
    args_schema: Type[BaseModel]
    skill_store: SkillStoreABC = Field(description="The skill store for persisting data")
    
    @property
    def category(self) -> str:
        return "venice_audio"
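
The first review point can be sketched in isolation. The example below assumes the project's `SkillConfig` is itself a `TypedDict` with shared keys. The stand-in `SkillConfig` defined here and its key types are hypothetical, not the project's actual definition. It shows that a `Config` inheriting from such a base picks up the shared keys and only declares what is specific to the skill.

```python
from typing import Dict, TypedDict


class SkillConfig(TypedDict):
    """Hypothetical stand-in for the project's shared base config."""

    enabled: bool
    states: Dict[str, str]


class Config(SkillConfig):
    """Configuration for Venice Audio skills (sketch)."""

    # Only the skill-specific key is declared; the rest is inherited.
    api_key: str


# A plain dict with every required key matches the Config type.
config: Config = {
    "enabled": True,
    "states": {"text_to_speech": "public"},  # hypothetical state name and value
    "api_key": "placeholder-key",
}
```

Inheriting from a common base keeps every skill's config compatible with code that only knows about the shared keys, which is presumably why the reviewer asked for it.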

bluntbrain, May 05 '25 06:05

Thank you for the review. I have already fixed it. Should I also add it to agent_schema.json?

yornfifty, May 05 '25 08:05

Latest test screenshot: [image]

yornfifty, May 27 '25 06:05