audiolm-pytorch
HuBERT instead of w2v-BERT?
Did you use HuBERT instead of w2v-BERT (used in the original paper) because of the availability of the HuBERT model on HF, or for other reasons, i.e., have you seen HuBERT results that support this replacement?
@eonglints recommended it, and he is much more well read on the literature
@jackieassa but I am open to supporting as many base models as you see fit, now that we have seen this architecture solves music generation
Sorry I missed this. Yeah, a combination of both reasons; the HuBERT model and k-means model are available, and also, a bunch of literature prior to AudioLM has used this HuBERT model and clustering method successfully.
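For anyone reading along, here is a rough sketch of what the "clustering method" step amounts to: each frame of encoder features is mapped to the index of its nearest k-means centroid, and those indices serve as the discrete semantic tokens. This is only an illustration under stated assumptions, not the repo's actual code. The function name `quantize_features` and the toy 2-D frames and centroids are made up for the example; real HuBERT frames are much higher dimensional and the centroids would come from a pretrained k-means model.

```python
from typing import List, Sequence


def quantize_features(
    features: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
) -> List[int]:
    """Map each feature frame to the index of its nearest centroid
    (squared Euclidean distance). Hypothetical helper for illustration."""
    tokens = []
    for frame in features:
        distances = [
            sum((x - c) ** 2 for x, c in zip(frame, centroid))
            for centroid in centroids
        ]
        tokens.append(distances.index(min(distances)))
    return tokens


# Toy stand-ins (assumptions): three 2-D centroids and four 2-D "frames".
toy_centroids = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
toy_frames = [[0.1, -0.1], [0.9, 1.2], [-0.8, 0.7], [0.2, 0.1]]

semantic_tokens = quantize_features(toy_frames, toy_centroids)
print(semantic_tokens)
```

Swapping the base model (HuBERT, w2v-BERT, or something else) only changes where the frames and centroids come from; the nearest-centroid lookup stays the same.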