nanoGPT
Add an opinionated guide for fine-tuning
It could be interesting to have a strongly opinionated guide from the author addressing some typical issues:
- Whether or not to freeze some layers while fine-tuning, and if so, which ones (see the first sketch after this list).
- Whether or not to freeze some tokens while fine-tuning, i.e. freezing part of the embedding matrix. If all the trainable tokens are new ones, this reduces to soft prompting (second sketch below).
- The weight to give prompt tokens when computing the loss. The OpenAI API currently uses 0.01 times the weight of completion tokens; in most other libraries it is simply one or zero (via masking). A sketch follows below.
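
To make the first point concrete, here is roughly what layer freezing looks like against nanoGPT's module names (`model.transformer.h`, `ln_f`, and the tied `wte`/`lm_head`). This is only an illustration, not a claim about what the guide should recommend; the number of blocks left trainable is a placeholder, and choosing it is exactly what the guide would help with:

```python
import torch
from model import GPT  # nanoGPT's model.py

# Load a pretrained GPT-2 checkpoint, as nanoGPT's finetuning configs do.
model = GPT.from_pretrained('gpt2')

# Freeze everything first...
for p in model.parameters():
    p.requires_grad = False

# ...then unfreeze only the last few transformer blocks plus the final layer norm.
# How many blocks (if any) to leave trainable is the open question; 2 is arbitrary.
n_trainable_blocks = 2
for block in model.transformer.h[-n_trainable_blocks:]:
    for p in block.parameters():
        p.requires_grad = True
for p in model.transformer.ln_f.parameters():
    p.requires_grad = True

# Note: nanoGPT ties lm_head.weight to transformer.wte.weight (one tensor),
# so freezing or unfreezing one also affects the other.
```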
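For the second point, freezing only part of the embedding matrix can't be expressed with `requires_grad` alone, since that flag is per-tensor; a gradient hook that zeroes the frozen rows' gradients is one way to sketch it. The row split below (original GPT-2 vocab vs. newly added tokens) is only illustrative:

```python
import torch

# Rows [0, n_original_tokens) are the pretrained GPT-2 vocabulary we want to
# keep fixed; any newly added tokens would occupy the rows after that.
n_original_tokens = 50257

wte = model.transformer.wte.weight  # tied with lm_head.weight in nanoGPT

def zero_grad_for_original_rows(grad):
    # Return a modified gradient with the frozen rows zeroed out.
    grad = grad.clone()
    grad[:n_original_tokens] = 0.0
    return grad

wte.register_hook(zero_grad_for_original_rows)

# If only the new rows remain trainable and everything else is frozen, this is
# effectively soft prompting. Caveat: decoupled weight decay (AdamW) still
# shrinks "frozen" rows, so put this parameter in a weight_decay=0 group if
# strict freezing matters.
```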
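And for the third point, a minimal sketch of a per-token-weighted loss in the spirit of nanoGPT's `cross_entropy` call; `weighted_lm_loss` is a made-up helper name, and the 0.01 default just echoes the OpenAI value mentioned above:

```python
import torch
import torch.nn.functional as F

def weighted_lm_loss(logits, targets, is_prompt, prompt_loss_weight=0.01):
    """Per-token cross-entropy with prompt positions down-weighted.

    logits:    (B, T, vocab_size) model outputs
    targets:   (B, T) next-token targets, -1 where ignored (nanoGPT's convention)
    is_prompt: (B, T) bool, True at prompt positions
    prompt_loss_weight: 1.0 treats prompts like completions, 0.0 masks them out,
                        0.01 mirrors the OpenAI API default mentioned above.
    """
    per_token = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        targets.view(-1),
        ignore_index=-1,
        reduction='none',
    ).view(targets.shape)
    weights = torch.ones_like(per_token)
    weights[is_prompt] = prompt_loss_weight
    weights = weights * (targets != -1)  # drop ignored positions from the average
    return (per_token * weights).sum() / weights.sum().clamp(min=1.0)
```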