Output doesn't match demos
Hi, I ran this locally:
from bark import SAMPLE_RATE, generate_audio
from IPython.display import Audio
text_prompt = """
♪ In the jungle, the mighty jungle, the lion barks tonight ♪
"""
audio_array = generate_audio(text_prompt)
Audio(audio_array, rate=SAMPLE_RATE)
Play the file: https://user-images.githubusercontent.com/7272343/233417746-dbf0ab65-49c7-477c-9373-1b4f87bdfb5e.mp4
Very odd sound. Any ideas?
Hm, yeah, that sounds like a dud. Try generating a few more times and see what it sounds like. It's an autoregressive model like GPT, so sometimes it goes off the rails. Music in particular can trigger that, because rhythm increases the chances of getting stuck in a weird loop. We're working on improving things, but for now you can probably just run it a few times until you get something good.
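If it helps, here's a rough sketch of the "run it a few times" idea. It only uses `generate_audio(text_prompt)` exactly as in the snippet above; the helper names `generate_candidates` and `bark_candidates` and the default of 4 attempts are made up for illustration and are not part of bark.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def generate_candidates(generate: Callable[[str], T], prompt: str, n: int = 4) -> List[T]:
    """Call `generate(prompt)` n times and return every result.

    Each call samples independently, so some results may be duds
    while others come out fine.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [generate(prompt) for _ in range(n)]


def bark_candidates(prompt: str, n: int = 4):
    """Hypothetical convenience wrapper around bark's generate_audio."""
    # Imported lazily so this file still loads where bark is not installed.
    from bark import generate_audio

    return generate_candidates(generate_audio, prompt, n)
```

In a notebook you could then loop over `bark_candidates(text_prompt)`, play each result with `Audio(audio_array, rate=SAMPLE_RATE)` as in the original snippet, and keep whichever one sounds best.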
This one came out pretty funny, almost like it didn't wanna sing
Sample: https://user-images.githubusercontent.com/7272343/233423797-3f6f45eb-65c5-4e0b-b656-0714c92a3821.mp4
😂
In our experiments with this set of models, for more interesting tokens such as ♪, you might need several generations to get something very good, but the good generations are very good.