
Make Bark a HF Pre-Trained Model

Open sanchit-gandhi opened this issue 1 year ago • 3 comments

PR to make the Bark model a HF PreTrainedModel. The PreTrainedModel class takes care of all loading / saving logic, enabling checkpoints to be downloaded / pushed directly to the HF Hub:

from bark import GPT

model = GPT.from_pretrained("suno/bark-text")  # load model weights from Hub

Model weights on the HF Hub have version control and download counters. Users can also filter HF Hub models by type, e.g. by TTS.

As a first step, this PR only makes the required modelling code changes. The next step is to update generation.py, namely replacing the _load_model functionality with a single .from_pretrained call: https://github.com/suno-ai/bark/blob/d621ee3088f29f6d12c2d8b0503e2368f18fabd9/bark/generation.py#L169

Since Transformers is already a dependency of Bark, this adds no new dependency requirements. It also has no effect on the .forward call (functionality remains 1-to-1 the same).
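To illustrate the idea, here is a minimal sketch of what subclassing PreTrainedModel gives you. It is not the code from this PR: TinyGPTConfig, TinyGPT, the "tiny-gpt-sketch" model type, and the layer sizes are hypothetical stand-ins, and the round trip uses a local temporary directory rather than the Hub. It only assumes the standard PretrainedConfig / PreTrainedModel API from Transformers.

```python
import tempfile

import torch
from torch import nn
from transformers import PretrainedConfig, PreTrainedModel


class TinyGPTConfig(PretrainedConfig):
    # Hypothetical config for illustration; not Bark's real hyperparameters.
    model_type = "tiny-gpt-sketch"

    def __init__(self, vocab_size=32, n_embd=16, **kwargs):
        self.vocab_size = vocab_size
        self.n_embd = n_embd
        super().__init__(**kwargs)


class TinyGPT(PreTrainedModel):
    # Hypothetical stand-in for a GPT-style module.
    config_class = TinyGPTConfig

    def __init__(self, config):
        super().__init__(config)
        self.wte = nn.Embedding(config.vocab_size, config.n_embd)
        self.lm_head = nn.Linear(config.n_embd, config.vocab_size, bias=False)

    def forward(self, idx):
        # The forward pass is ordinary PyTorch; PreTrainedModel does not alter it.
        return self.lm_head(self.wte(idx))


def roundtrip_matches():
    """Save a model, reload it with from_pretrained, and compare outputs."""
    model = TinyGPT(TinyGPTConfig()).eval()
    idx = torch.tensor([[1, 2, 3]])
    with tempfile.TemporaryDirectory() as tmp_dir:
        model.save_pretrained(tmp_dir)  # writes the config and the weights
        reloaded = TinyGPT.from_pretrained(tmp_dir).eval()
    with torch.no_grad():
        return torch.equal(model(idx), reloaded(idx))
```

In this sketch the loading and saving logic comes entirely from the base class, while forward stays a plain PyTorch method, which is why the change should leave model outputs untouched.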

For details on the PreTrainedModel class, refer to the code and docs.

sanchit-gandhi avatar Apr 17 '23 13:04 sanchit-gandhi

awesome, thanks sanchit. what do we need to do on the HF side for this to work?

gkucsko avatar Apr 21 '23 18:04 gkucsko

Hi @gkucsko - I'm VB, and I work with @sanchit-gandhi on the Open Source Audio team at Hugging Face. IMO in terms of next steps, we would need to upload the open-source model checkpoints to https://huggingface.co/suno, and then the .from_pretrained method should work with the checkpoints under the suno org. Happy to help with it.

We can also add a model card there, along with inference details, to help with discovery.

Vaibhavs10 avatar Apr 24 '23 11:04 Vaibhavs10

awesome thanks, super swamped rn, but promise i'll get back to it later in the week!

gkucsko avatar Apr 24 '23 15:04 gkucsko