
How many samples were used to train Flan-T5?

Open rohan-mehta opened this issue 1 year ago • 0 comments

Hey all, possibly a silly question. I see that the Hugging Face collection has many millions of samples, and the Google blog post suggests that the collection has 15M samples: https://ai.googleblog.com/2023/02/the-flan-collection-advancing-open.html

On the other hand, mixtures.py suggests that ~350K samples is the default maximum: https://github.com/google-research/FLAN/blob/main/flan/v2/mixtures.py#L27

How many samples were actually used to fine-tune T5 and produce Flan-T5?

Thanks!

rohan-mehta, Aug 07 '23 17:08