Haoyan Luo issues

Results: 3 issues of Haoyan Luo

Fix to include ln_final.w in RMSNorm hook

# Description Before the fix, `model_cache["ln_final.hook_normalized"]` returns only the RMS-normalized hidden states, without multiplying by the `final_ln` weight. This might contradict the design of this hook. I followed the...

no-rebase
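
To illustrate the distinction the issue above draws, here is a minimal sketch in plain Python. It is not the library's code: the function names `rms_normalize` and `rms_norm`, the `eps` value, and the toy vectors are all assumptions made for illustration. It only shows how RMS-normalized hidden states differ from the same states after element-wise scaling by a norm weight.

```python
import math

def rms_normalize(x, eps=1e-6):
    """Divide each element by the root mean square of the vector (hypothetical helper)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]

def rms_norm(x, weight, eps=1e-6):
    """Full RMSNorm: normalize, then scale element-wise by the learned weight."""
    return [n * w for n, w in zip(rms_normalize(x, eps), weight)]

hidden = [3.0, 4.0]   # toy hidden state (made-up values)
weight = [2.0, 0.5]   # toy stand-in for the final norm's weight (made-up values)

normalized = rms_normalize(hidden)   # normalized only, no weight applied
scaled = rms_norm(hidden, weight)    # normalized, then multiplied by the weight

print(normalized)
print(scaled)
```

The two results differ whenever the weight is not all ones, which is the discrepancy the issue describes for the cached value.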

Question on the source of commonsense_15k

Hi there, thanks for your work! I want to inquire about the source of the commonsense_15k dataset, as I couldn't find it described in either the paper or this repo.

Questions Regarding "Sink Tokens"

Hi! Thank you for your interesting paper and its implementation! I have a few questions I hope you can clarify: 1. When employing the pre-trained model with a "sink token,"...