ezlocalai
[WIP] Add create_audiobook function
Added the `create_audiobook` function.
- API endpoint added for `POST /v1/audio/book`
- Accepts file upload
- Optionally accepts `voice` and `language` (2-letter code) in the body. Uses the default voice and `en` for `language` if they are not defined.
- The narrator voice is taken from `voice`
- If `language` is defined, the input is translated to the desired language, and both the text and the audio are output in that language.
- Still need to add 100x male and female voices for random selection. These will be synthesized so that they are not real people's voices.
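For reviewers who want to try the endpoint, here is a minimal client-side sketch of the optional body fields described above. The helper name `build_audiobook_form` is hypothetical and not part of this PR. The upload field name (`file`) and the server address in the trailing comment are also assumptions. Omitted fields are left out so the server can apply its own defaults.

```python
import re

ENDPOINT = "/v1/audio/book"


def build_audiobook_form(voice=None, language=None):
    """Build the optional form fields for POST /v1/audio/book.

    Hypothetical helper: fields left as None are omitted so the server
    falls back to its default voice and to "en" for the language.
    """
    fields = {}
    if voice is not None:
        fields["voice"] = voice
    if language is not None:
        code = language.strip().lower()
        # The endpoint expects a 2-letter language code.
        if not re.fullmatch(r"[a-z]{2}", code):
            raise ValueError(f"language must be a 2-letter code, got {language!r}")
        fields["language"] = code
    return fields


# Possible usage with the third-party `requests` package (the field name
# "file" and the base URL are assumptions, not confirmed by this PR):
#
#   with open("book.txt", "rb") as f:
#       requests.post(
#           "http://localhost:8091" + ENDPOINT,
#           files={"file": f},
#           data=build_audiobook_form(voice="narrator", language="es"),
#       )
```

Validating the language code on the client is a convenience only; the server remains the source of truth for which values it accepts.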
```mermaid
graph TD
    A[Start] --> B[Chunk book content]
    B --> C{Paragraph > 2000 tokens?}
    C -->|No| D[Add paragraph to chunk]
    C -->|Yes| E[Split paragraph into sentences]
    E --> F[Group sentences up to 2000 tokens]
    F --> D
    D --> G[Process each chunk]
    G --> H{Extract characters, dialogue, and narration}
    H --> I[Merge similar characters]
    I --> J{Translation requested?}
    J -->|Yes| K[Translate content]
    J -->|No| L[Assign voices to characters]
    K --> L
    L --> M[Generate audio for each content item]
    M --> N[Combine audio segments]
    N --> O[Export final audiobook]
    O --> P[Save text output]
    P --> Q[End]
    subgraph "Chunking process"
        B
        C
        D
        E
        F
    end
    subgraph "For each chunk"
        G
        H
        I
    end
    subgraph "For each content item"
        M
    end
```
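The chunking stage of the flowchart can be sketched roughly as follows. This is an illustration, not the code in this PR: `count_tokens` uses whitespace-separated words as a stand-in for the real tokenizer, the sentence splitter is a naive regex, and packing several short paragraphs into one chunk is an assumption about how "Add paragraph to chunk" behaves. All function names are hypothetical.

```python
import re

MAX_TOKENS = 2000  # limit taken from the flowchart


def count_tokens(text):
    # Assumption: word count approximates the tokenizer's token count.
    return len(text.split())


def split_sentences(paragraph):
    # Naive split after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def group_sentences(sentences, max_tokens):
    """Join consecutive sentences into groups of at most max_tokens.

    A single sentence longer than max_tokens is kept whole.
    """
    groups, current, current_tokens = [], [], 0
    for sentence in sentences:
        n = count_tokens(sentence)
        if current and current_tokens + n > max_tokens:
            groups.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(sentence)
        current_tokens += n
    if current:
        groups.append(" ".join(current))
    return groups


def chunk_book(text, max_tokens=MAX_TOKENS):
    """Split book text into chunks of roughly max_tokens tokens."""
    chunks, current, current_tokens = [], [], 0
    for paragraph in re.split(r"\n\s*\n", text):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if count_tokens(paragraph) > max_tokens:
            # Oversized paragraph: fall back to sentence groups.
            pieces = group_sentences(split_sentences(paragraph), max_tokens)
        else:
            pieces = [paragraph]
        for piece in pieces:
            n = count_tokens(piece)
            if current and current_tokens + n > max_tokens:
                chunks.append("\n\n".join(current))
                current, current_tokens = [], 0
            current.append(piece)
            current_tokens += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

For example, `chunk_book(text, max_tokens=6)` on a 3-word paragraph followed by a 9-word paragraph of three sentences yields three chunks: the first paragraph, the first two sentences joined, and the last sentence.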