Technical guide and explanation of Bark
Hi all, I just wrote a detailed article on the technical aspects of Bark. Feel free to check it out here: https://betterprogramming.pub/text-to-audio-generation-with-bark-clearly-explained-4ee300a3713a?sk=e2b2f75f5fc93c656bef031c60bf99bf
Nice writeup. I would just note that this passage is not quite right:
"Bark has various voices for speech generation based on language, gender, and background sounds. The complete voice collection with more than 100 presets can be found in the speaker library."
The speaker library is not the complete collection of voices. Even without add-ons for voice cloning, you can simply not specify a voice, and Bark will generate a new random voice on the spot. That voice can be saved and used again: the voice is simply the audio sample itself, stored as raw tokens. So Bark is not limited to 100 voices; the number of possible voices is effectively unlimited. And because Bark tries to match the voice to the text prompt, you can use Bark like a voice lab just by text prompting: create a voice, then refine or tweak it with additional prompts and resaves.
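For anyone who wants to try this, here is a rough sketch of the save-and-reuse workflow. It assumes the `generate_audio(..., output_full=True)` and `bark.api.save_as_prompt` calls as they appear in the Bark repository at the time of writing, so check them against your installed version. The helper names (`voice_path`, `create_voice`, `speak_with_voice`) and the `voices` directory are made up for illustration and are not part of Bark.

```python
from pathlib import Path


def voice_path(name, voices_dir="voices"):
    """Hypothetical helper: location of a saved voice prompt (.npz file)."""
    return str(Path(voices_dir) / f"{name}.npz")


def create_voice(text, name, voices_dir="voices"):
    """Generate speech with no preset voice and save the resulting voice."""
    # Imported lazily so the path helper works without Bark installed.
    from bark import generate_audio
    from bark.api import save_as_prompt

    # With no history_prompt, Bark picks a new random voice for this text.
    # output_full=True also returns the raw tokens that define that voice.
    full_generation, audio = generate_audio(text, output_full=True)

    path = voice_path(name, voices_dir)
    Path(voices_dir).mkdir(parents=True, exist_ok=True)
    save_as_prompt(path, full_generation)  # the path must end in ".npz"
    return audio, path


def speak_with_voice(text, name, voices_dir="voices"):
    """Generate new speech that reuses a previously saved voice."""
    from bark import generate_audio

    # Assumption: history_prompt accepts a path to a saved .npz prompt.
    return generate_audio(text, history_prompt=voice_path(name, voices_dir))


# Example usage (downloads the Bark models on first run):
# audio, path = create_voice("Hello, I am a calm, deep-voiced narrator.", "narrator")
# more_audio = speak_with_voice("This line reuses the same voice.", "narrator")
```

To refine a voice, you could call `create_voice` again with different prompt text, listen to the result, and keep only the `.npz` files you like.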
Good point, I have made the edits. Thanks for highlighting this.
@kennethleungty can you do a writeup on serpai's voice cloning addon? I would love to understand, on a technical level, how to train new voices at a quality high enough to match Bark's voice library.
Yes, me too. I would also like to know how to train the model on a new language.