
Missing documentation on how to train a model

Open VarunGumma opened this issue 1 year ago • 20 comments

Is there any documentation, or are there examples, that I can refer to for training a transformer model from scratch using fairseq2? The examples folder in the repository seems to be empty.

VarunGumma · Oct 03 '23

Hi, thanks for your interest. I'm working on that. It should arrive in a week or two.

gwenzek · Oct 03 '23

@gwenzek any update on the documentation

VarunGumma · Nov 08 '23

> Hi, thanks for your interest. I'm working on that. It should arrive in a week or two.

Any update on this @gwenzek ?

abdr17 · Dec 28 '23

@VarunGumma @abdr17 We are working on our first open-source training recipe right now and plan to release it in early January. I will keep you posted once it is released.

cbalioglu · Dec 28 '23

Hi @cbalioglu , I know generative models are all the hype at the moment, but I am curious whether fairseq2 will also support e.g. RoBERTa/XLM-R/data2vec pretraining :thinking: It would be awesome to have!

stefan-it · Jan 03 '24

@cbalioglu Could you please tell what kind of model training you are going to release? (e.g., BERT, GPT, etc.)

netw0rkf10w · Jan 18 '24

Hey @netw0rkf10w, @stefan-it, the plan is to have training recipes for NLLB (encoder/decoder machine translation) and wav2vec2/w2v-BERT (encoder-based speech embedding SSL), and a fine-tuning recipe for LLaMA 7B/70B (in this order) in January/February. My goal is to cover the major architectures available in fairseq, so I can certainly try to prioritize other models depending on interest/demand. Please let me know if you have particular models in mind.

cbalioglu · Jan 18 '24

Hi @cbalioglu , I would definitely vote for XLM-RoBERTa (there's still a lot of interest in it; see the XLM-V paper from EMNLP 2023) and data2vec (1 and 2), which have very promising and fresh training objectives :)

Many thanks in advance!

stefan-it · Jan 18 '24

Thanks for the reply @cbalioglu ! I would like to vote for RoBERTa and data2vec as well!

netw0rkf10w · Jan 18 '24

@cbalioglu Along these lines, I was wondering whether fairseq2 is ready for multi-node pre-training and fine-tuning of wav2vec2. I currently use fairseq, which has great support for efficient distributed training. Will I lose any of that by switching to fairseq2?

Pchatain · Feb 23 '24

Hey @Pchatain, the recipe for encoder-decoder based machine translation is pretty much ready and I expect to merge it this week. I have started working on wav2vec2 and w2v-bert pretraining recipes (since we require them for our ongoing projects) and they will be ready in the next few weeks (definitely in March). Using those recipes will give you the same functionality as in fairseq.

cbalioglu · Feb 26 '24

@cbalioglu any updates?

jcuenod · Apr 05 '24

@cbalioglu Hi, thanks for your work. I would like to vote for data2vec(1,2) too!

cageyoko · Apr 15 '24

Any news on this?

JAVI897 · Apr 15 '24

Are there any updates regarding w2v-bert?

kssmmm · May 06 '24

Is there an estimated timeline for when we will have documentation and training recipes for fairseq2 models, specifically w2v-BERT?

kdcyberdude · May 12 '24

Hi all, if you look at the GitHub commits https://github.com/facebookresearch/fairseq2/commits/main/ you can see that they are working on it; I am not sure constant questions about it are useful (@cbalioglu correct me if I am wrong).

I assume a +1 on their posts will be enough. On the other hand, @cbalioglu, if possible, stating which models' training will be implemented would already give people some assurance that they should wait and not try to move to https://github.com/NVIDIA/NeMo or https://github.com/espnet/espnet ;-)

orena1 · May 12 '24

Hey folks, sorry for the delays. As @orena1 mentioned, we are actively working on the training recipes, including more conventional ones originating from fairseq like wav2vec2 and BERT, as well as LLM pretraining and finetuning. We use and develop these recipes internally at FAIR for various projects, so we want to make sure that they have full parity with the expected runtime and model performance. We are very close to releasing the recipes for wav2vec2 pretraining, wav2vec2 ASR finetuning, and LLM instruction finetuning in the next few weeks.

cbalioglu · May 14 '24

@cbalioglu Thank you for your work! Looking forward to the wav2vec2 pretraining recipe!

gau-nernst · May 30 '24

https://github.com/facebookresearch/fairseq2/tree/main/src/fairseq2/recipes

jcuenod · Aug 23 '24
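
The recipes directory linked above is the entry point for training with fairseq2. As a minimal, hypothetical sketch of what loading a pretrained model from a fairseq2 asset card might look like (the loader function `load_wav2vec2_model` and the card name `"wav2vec2_base"` below are assumptions, not verified against the released API; check the recipes and asset cards in the repository for the actual entry points and names):

```python
# Hypothetical sketch only: the import path, loader name, and asset card
# name are assumptions; consult src/fairseq2/recipes and the asset cards
# shipped with fairseq2 for the actual entry points.
import torch

from fairseq2.models.wav2vec2 import load_wav2vec2_model  # assumed loader

# Load a pretrained wav2vec2 model from an asset card (card name assumed).
model = load_wav2vec2_model(
    "wav2vec2_base",
    device=torch.device("cpu"),
    dtype=torch.float32,
)
model.eval()
print(sum(p.numel() for p in model.parameters()), "parameters")
```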