Lucas Newman

37 comments by Lucas Newman

Can you try training in fp32 to debug it? If that works, we'll know the issue is the precision conversion and not the loss setup in the...
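
For reference, here's a minimal sketch of that kind of fp32 debugging run, assuming a standard PyTorch loop with AMP (the tiny model, data, and step count are placeholders for the real setup):

```python
import torch
import torch.nn as nn

# Flip `use_amp` off to run the identical loop in full fp32 for debugging.
use_amp = False

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(80, 10).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp and device == "cuda")

for step in range(10):
    x = torch.randn(8, 80, device=device)
    y = torch.randint(0, 10, (8,), device=device)
    optimizer.zero_grad()
    # With `enabled=False`, autocast is a no-op and the loss is computed in fp32.
    with torch.autocast(device_type=device, enabled=use_amp):
        loss = nn.functional.cross_entropy(model(x), y)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```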

@furqan4545 you can find a pretrained quantizer for HuBERT base [here](https://github.com/facebookresearch/fairseq/blob/main/examples/hubert/README.md), and there are also utilities to create your own quantizer in the `simple_kmeans` directory from one of the larger...
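
For anyone following along, a rough sketch of the k-means step in the spirit of the `simple_kmeans` utilities, assuming you've already dumped HuBERT features to an array (the feature matrix, cluster count, and filename below are illustrative, not the fairseq defaults):

```python
import joblib
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Stand-in for HuBERT hidden states dumped to disk, shaped (num_frames, feature_dim).
features = np.random.randn(20_000, 768).astype(np.float32)

kmeans = MiniBatchKMeans(
    n_clusters=500,   # illustrative; pick to match your downstream setup
    batch_size=10_000,
    max_iter=100,
    n_init=20,
)
kmeans.fit(features)

# Persist the quantizer, then map new frames to discrete unit IDs.
joblib.dump(kmeans, "hubert_base_kmeans_500.bin")
units = kmeans.predict(features[:10])
print(units)
```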

@itsjamie You're on the right track, but for the pretrained models, you won't need the MFCC features, since they are only used in bootstrapping the base model during the first...
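
As a hedged illustration of using pretrained features instead of MFCCs, something like the following pulls hidden states from an intermediate HuBERT layer via `transformers` (layer 6 is a common choice for HuBERT base, not a prescribed value, and the random waveform stands in for real 16 kHz audio preprocessed per the checkpoint's feature extractor):

```python
import torch
from transformers import HubertModel

model = HubertModel.from_pretrained("facebook/hubert-base-ls960").eval()
waveform = torch.randn(1, 16000)  # (batch, samples), ~1s at 16 kHz

with torch.no_grad():
    outputs = model(input_values=waveform, output_hidden_states=True)

# hidden_states[0] is the embedding output; hidden_states[6] is the output of
# the 6th transformer layer, shaped (batch, frames, 768).
features = outputs.hidden_states[6]
print(features.shape)
```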

@itsjamie I think the coin flip is just to balance the training objectives and make sure both masking variations are being used during training. With regard to the batch dimension,...
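
A toy sketch of that coin flip, with two hypothetical masking variants (`mask_time` and `mask_channels` are made-up stand-ins, and the probabilities are arbitrary):

```python
import torch

def mask_time(x, p=0.3):
    # Zero out random time steps across all feature channels.
    b, t, d = x.shape
    mask = torch.rand(b, t, 1) < p
    return x.masked_fill(mask, 0.0)

def mask_channels(x, p=0.3):
    # Zero out random feature channels across all time steps.
    b, t, d = x.shape
    mask = torch.rand(b, 1, d) < p
    return x.masked_fill(mask, 0.0)

x = torch.randn(4, 100, 80)  # (batch, time, features)

# One flip per step applies the same variant to the whole batch; a flip per
# example (torch.rand(b) < 0.5) would mix variants within a batch instead.
if torch.rand(()) < 0.5:
    x = mask_time(x)
else:
    x = mask_channels(x)
```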

Nice! There are some additional quantizers for HuBERT base that Meta made available [here](https://github.com/facebookresearch/textlesslib/tree/main/examples/expresso) if you want to compare and contrast. Let us know how it goes!

> > ah, the code is all in there and @lucasnewman has already trained models successfully. i'll update the readme by end of week
>
> Hello. Will the weights...

I'm not sure about the rotary embedding (Phil is the expert there), but in terms of the EOS token, some of it depends on how you're handling EOS in the...
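
To make the EOS point concrete, one common convention looks roughly like this (the token id, helper names, and `toy_step` are all hypothetical):

```python
import torch

EOS_ID = 1024  # made-up id; use whatever your vocabulary reserves for EOS

def append_eos(targets: torch.Tensor) -> torch.Tensor:
    # targets: (batch, seq_len) of token ids; appends an explicit EOS column.
    eos = torch.full((targets.size(0), 1), EOS_ID, dtype=targets.dtype)
    return torch.cat([targets, eos], dim=1)

def decode(step_fn, max_len: int = 256) -> list[int]:
    # Autoregressive decoding that stops as soon as the model emits EOS.
    tokens: list[int] = []
    for _ in range(max_len):
        next_id = step_fn(tokens)
        if next_id == EOS_ID:
            break
        tokens.append(next_id)
    return tokens

targets = torch.randint(0, EOS_ID, (4, 20))
targets_with_eos = append_eos(targets)  # (4, 21), EOS in the last column

# Toy stand-in for the model: emits EOS after five steps.
def toy_step(tokens):
    return EOS_ID if len(tokens) >= 5 else len(tokens)

print(decode(toy_step))  # [0, 1, 2, 3, 4]
```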

Ok, it's tricky to debug without seeing the code in that case, unfortunately. If you can't share it, I would check to make sure your padding and/or mask tokens aren't...
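
One quick way to check the padding side of that, assuming a cross-entropy objective in PyTorch (the ids and shapes are placeholders):

```python
import torch
import torch.nn.functional as F

PAD_ID = 1025  # made-up padding id

logits = torch.randn(4, 50, 1026)         # (batch, seq, vocab)
targets = torch.randint(0, 1024, (4, 50))
targets[:, 40:] = PAD_ID                  # pretend the tail is padding

loss = F.cross_entropy(
    logits.transpose(1, 2),  # cross_entropy expects (batch, vocab, seq)
    targets,
    ignore_index=PAD_ID,     # padded positions are dropped from the average
)
print(loss)
```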

Can you share training/validation loss curves or the actual training code? It's hard to know the exact issue from the description. A couple of initial thoughts: 1) Your learning rate...
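
On the learning-rate point, a minimal sketch of a linear warmup schedule, which is a common fix when loss diverges early (the warmup length and peak LR are illustrative):

```python
import torch

model = torch.nn.Linear(10, 10)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

warmup_steps = 1000
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    # Scale the LR linearly from ~0 up to the peak over `warmup_steps`.
    lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps),
)

for step in range(5):
    optimizer.step()
    scheduler.step()
    print(step, scheduler.get_last_lr())
```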

That's a pretty big model with 657M parameters, so you may be running out of memory. You could try reducing the size of the Conformer network — if you share...
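
A quick sketch for auditing where the parameters live before shrinking things (the toy model below is just a stand-in for the actual 657M-parameter network):

```python
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=6)
        self.head = nn.Linear(512, 1024)

model = ToyModel()
total = sum(p.numel() for p in model.parameters())
print(f"total: {total / 1e6:.1f}M params")

# A per-submodule breakdown makes the biggest contributors obvious.
for name, module in model.named_children():
    n = sum(p.numel() for p in module.parameters())
    print(f"  {name}: {n / 1e6:.1f}M")
```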