Jakub Piotr Cłapa

Results: 77 comments by Jakub Piotr Cłapa

I am pretty sure you can view a table like this if you open the model file in [Netron](https://netron.app/). We don’t have any diagrams right now; the only documentation of...

Hey, that's a good catch, sorry about that. This seems to be a regression related to the optimization work we did recently. Since `vq_stoks` is not normally used during inference...

Both models use the semantic token IDs, but we also load the frozen token embeddings from the `vq_stoks` model and use linear projection layers to project them into the model...

Hey, that's a great idea and I experimented with this a bit. If you want lower latency right now you can split out the first sentence, generate it and before...
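The "split out the first sentence" step mentioned above could look roughly like the sketch below. It is only an illustration: the helper name `split_first_sentence` is hypothetical, and the naive punctuation-based regex is an assumption, not how the project itself segments text.

```python
import re
from typing import Tuple


def split_first_sentence(text: str) -> Tuple[str, str]:
    """Return (first_sentence, rest); rest is "" if there is only one sentence.

    Hypothetical helper: splits at the first '.', '!' or '?' that is
    followed by whitespace. Abbreviations such as "Dr." will fool it.
    """
    text = text.strip()
    # maxsplit=1 keeps everything after the first boundary in one piece.
    parts = re.split(r"(?<=[.!?])\s+", text, maxsplit=1)
    first = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    return first, rest


first, rest = split_first_sentence("Hello there. How are you? Fine.")
# `first` can be sent to synthesis immediately; `rest` is handled afterwards.
```

The first sentence is short, so generating it separately should give audio sooner than waiting for the whole text.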

The inference for the currently released models is quite well optimized in PyTorch. If this is too slow then there are also smaller models available (`tiny` and `base`) which are...

I got similar results with my own SoundStorm implementation. It did figure out the silence (the pauses) from the semantic tokens and sometimes it would predict single sounds correctly (like...

I noticed that the model can generate audio that sounds better than the ground truth (both at 2 quantizers) with the accuracy of the free running (without teacher forcing) generation...