Dec branch commit

Open cesposo opened this issue 1 year ago • 0 comments

Option 1 - Flag Ablation code - ReLU - Restricted Setting - Softmax q-error +

Option 2 - Flag Scale attention weights based on the context window * log(query_position) [an error that grows with context length can be bounded by the context length]
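
A rough sketch of what Option 2 could look like, not the branch's actual code. It assumes the scaling is applied to the pre-softmax logits of each query row, that positions are 1-indexed, and that the factor is `window_coeff * log(query_position + 1)`. The function name, the `window_coeff` parameter (a stand-in for however the context window enters the factor), and the `+ 1` offset are all hypothetical choices made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def log_scaled_causal_attention(scores, window_coeff=1.0):
    """Return causal attention weights with position-dependent logit scaling.

    scores[i][j] is the raw logit of query i against key j (0-indexed).
    ASSUMPTION: the scale is window_coeff * log(query_position + 1), where
    query_position = i + 1. The issue text does not pin down the exact form.
    """
    weights = []
    for i, row in enumerate(scores):
        query_position = i + 1
        scale = window_coeff * math.log(query_position + 1)
        # Causal mask: query i only sees keys 0..i.
        visible = [scale * s for s in row[: i + 1]]
        probs = softmax(visible)
        # Masked (future) keys get weight exactly 0.
        weights.append(probs + [0.0] * (len(row) - i - 1))
    return weights


scores = [[1.0, 0.0, 0.0, 0.0] for _ in range(4)]
weights = log_scaled_causal_attention(scores)
```

With this choice the logits of later queries are multiplied by a larger factor, which counteracts the flattening of the softmax as more keys become visible. Whether that matches the bound mentioned in the option would need to be checked against the ablation.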

Option 3 - Flag Instead of using positional encoding, replace the first field with a recurrent neural network (RNN) or LSTM

Jan 05 '25 21:01 cesposo