yet-another-retnet

Change how input projections are implemented; seems to converge faster

Open • draguve opened this issue 1 year ago • 1 comment

draguve Sep 28 '23 22:09

@draguve Thanks for this!

I'm not seeing a significant change in training convergence. Here are some brief training logs on the Project Gutenberg example:

  • red -> main with bias=False
  • green -> this branch
[Screenshot of the training logs, 2023-10-03]

I wonder if the issue is specific to your application, and possibly just due to differences in initialization.

fkodom Oct 03 '23 20:10