emergent_in_context_learning Have you experimented with a smaller model?

Open La-SilverLand opened this issue 2 years ago • 0 comments

Hi, in the paper, a transformer model with 12 layers and an embedding size of 64 is used to validate your hypothesis. Did you run any trial experiments on smaller models? If so, what were the results?

Feb 10 '23 07:02 La-SilverLand
