Technical guide and explanation to Bark

Open kennethleungty opened this issue 1 year ago • 4 comments

Hi all, I just wrote a detailed article on the technical aspects of Bark. Feel free to check it out here: https://betterprogramming.pub/text-to-audio-generation-with-bark-clearly-explained-4ee300a3713a?sk=e2b2f75f5fc93c656bef031c60bf99bf

kennethleungty, Oct 14 '23 06:10

Nice writeup. I would just note that the collection described in this passage:

"Bark has various voices for speech generation based on language, gender, and background sounds. The complete voice collection with more than 100 presets can be found in the speaker library."

is not the complete collection of voices. Even without add-ons for voice cloning, you can simply leave the voice unspecified and Bark will generate a new random voice on the spot. That voice can be saved and reused: a voice is simply the audio sample itself, stored as raw tokens. So Bark is not limited to 100 voices; the number of possible voices is effectively unlimited. And because Bark tries to match the voice to the text prompt, you can use Bark like a voice lab through text prompting alone, creating voices and then refining or tweaking them with additional prompts and re-saves.
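
For anyone who wants to try this, here is a minimal sketch of the save-and-reuse workflow. It assumes the `bark` package is installed and relies on `generate_audio(..., output_full=True)` and `save_as_prompt` from `bark.api`. The helper names (`voice_path`, `create_voice`, `speak_with_voice`), the `voices` directory, and the example prompts are hypothetical and only there for illustration.

```python
from pathlib import Path


def voice_path(name, directory="voices"):
    """Build the .npz path for a saved voice (hypothetical naming scheme)."""
    return str(Path(directory) / f"{name}.npz")


def create_voice(text, name, directory="voices"):
    """Generate speech with no preset, then save the resulting voice tokens."""
    # Imported lazily so the path helper works without Bark installed.
    from bark.api import generate_audio, save_as_prompt

    # history_prompt=None lets Bark pick a new random voice for this text.
    # output_full=True also returns the token arrays that define the voice.
    full_generation, audio = generate_audio(
        text, history_prompt=None, output_full=True
    )
    path = voice_path(name, directory)
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    save_as_prompt(path, full_generation)  # expects a path ending in .npz
    return path, audio


def speak_with_voice(text, path):
    """Generate new speech conditioned on a previously saved voice file."""
    from bark.api import generate_audio

    return generate_audio(text, history_prompt=path)


if __name__ == "__main__":
    # Hypothetical prompts; the first run downloads the Bark models.
    saved, _ = create_voice("An old sailor tells a story by the fire.", "sailor")
    audio = speak_with_voice("And that is how we found the island.", saved)
```

Re-running `create_voice` with a tweaked prompt and saving under a new name is one way to do the refine-and-resave loop described above.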

JonathanFly, Oct 19 '23 02:10

Good point, I have made the edits. Thanks for highlighting this.

kennethleungty, Oct 19 '23 02:10

@kennethleungty can you do a writeup on serpai's voice cloning add-on? I would love to understand, on a technical level, how to train new voices at a quality high enough to match Bark's voice library.

platform-kit, Nov 02 '23 01:11

Yes, me too. I would also like to know how to train the model on a new language.

boringtaskai, Jan 11 '24 16:01