[WIP] Add create_audiobook function

Open Josh-XT opened this issue 1 year ago • 0 comments

Added create_audiobook function.

Added an API endpoint, POST /v1/audio/book:
- Accepts a file upload.
- Optionally accepts voice and language (a two-letter code) in the body. If they are not defined, the default voice and en are used.
- The narrator uses the voice given in voice.
- If language is defined, the input is translated to that language, and both the text and audio output are in that language.

Still need to add 100x male and female voices for random selection. These will be synthesized so that they're not real people's voices.
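
To make the optional-parameter behavior concrete, here is a minimal sketch of how the defaults could be resolved. The function name `resolve_audiobook_options`, the `"default"` voice placeholder, and the validation rule are all assumptions for illustration, not the actual implementation.

```python
from typing import Optional, Tuple

# Hypothetical placeholders; the real default voice name is not given in this issue.
DEFAULT_VOICE = "default"
DEFAULT_LANGUAGE = "en"


def resolve_audiobook_options(
    voice: Optional[str] = None,
    language: Optional[str] = None,
) -> Tuple[str, str, bool]:
    """Return (narrator_voice, language, translate) for a request.

    translate is True only when the caller explicitly supplied a language,
    mirroring "if language is defined, it will translate the input".
    """
    narrator_voice = voice.strip() if voice and voice.strip() else DEFAULT_VOICE

    if language is None or not language.strip():
        return narrator_voice, DEFAULT_LANGUAGE, False

    code = language.strip().lower()
    # Assumed validation: exactly two ASCII letters (e.g. "en", "es").
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"language must be a two-letter code, got {language!r}")
    return narrator_voice, code, True
```

With no arguments this falls back to the default voice and en without translating; passing a language such as "ES" normalizes it to "es" and flags the request for translation.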

graph TD
    A[Start] --> B[Chunk book content]
    B --> C{Paragraph > 2000 tokens?}
    C -->|No| D[Add paragraph to chunk]
    C -->|Yes| E[Split paragraph into sentences]
    E --> F[Group sentences up to 2000 tokens]
    F --> D
    D --> G[Process each chunk]
    G --> H[Extract characters, dialogue, and narration]
    H --> I[Merge similar characters]
    I --> J{Translation requested?}
    J -->|Yes| K[Translate content]
    J -->|No| L[Assign voices to characters]
    K --> L
    L --> M[Generate audio for each content item]
    M --> N[Combine audio segments]
    N --> O[Export final audiobook]
    O --> P[Save text output]
    P --> Q[End]

    subgraph "Chunking process"
    B
    C
    D
    E
    F
    end

    subgraph "For each chunk"
    G
    H
    I
    end

    subgraph "For each content item"
    M
    end
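
The "Chunking process" part of the flowchart above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `count_tokens` uses a whitespace word count in place of a real tokenizer, the sentence splitter is a simple regex, and packing several short paragraphs into one chunk up to the limit is my reading of "Add paragraph to chunk".

```python
import re
from typing import Callable, List

MAX_TOKENS = 2000  # limit taken from the flowchart


def count_tokens(text: str) -> int:
    # Assumption: word count stands in for the real tokenizer.
    return len(text.split())


def split_sentences(paragraph: str) -> List[str]:
    # Naive split after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def group_sentences(sentences: List[str], max_tokens: int,
                    count: Callable[[str], int]) -> List[str]:
    """Greedily join consecutive sentences into groups of up to max_tokens."""
    groups, current, used = [], [], 0
    for sentence in sentences:
        n = count(sentence)
        if current and used + n > max_tokens:
            groups.append(" ".join(current))
            current, used = [], 0
        current.append(sentence)
        used += n
    if current:
        groups.append(" ".join(current))
    return groups


def chunk_book(text: str, max_tokens: int = MAX_TOKENS,
               count: Callable[[str], int] = count_tokens) -> List[str]:
    """Split book text into chunks, breaking up oversized paragraphs."""
    chunks, current, used = [], [], 0
    for paragraph in re.split(r"\n\s*\n", text):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if count(paragraph) > max_tokens:
            # Oversized paragraph: split into sentences, then regroup.
            pieces = group_sentences(split_sentences(paragraph), max_tokens, count)
        else:
            pieces = [paragraph]
        for piece in pieces:
            n = count(piece)
            if current and used + n > max_tokens:
                chunks.append("\n\n".join(current))
                current, used = [], 0
            current.append(piece)
            used += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Note that a single sentence longer than the limit is kept whole in this sketch, so a chunk can still exceed 2000 tokens in that edge case.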

Oct 05 '24 13:10 Josh-XT