RWKV-LM

Paper covering additional tokens idea

Open jph00 opened this issue 1 year ago • 5 comments

Hi there. You mention in the readme that you're interested in potentially adding some special tokens/markers to represent stuff like capitalisation. Just wanted to let you know we tried that in the ULMFiT paper, and it worked pretty well. You can read the details here: https://arxiv.org/abs/1801.06146. We went beyond capitalisation and added some other tokens too.

Anyhoo this is just an FYI in case it's helpful to you.
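
For anyone skimming, here's a rough sketch of the general idea of capitalisation markers: lowercase the text and put a marker token in front of any word that was capitalised or all caps, so the vocabulary doesn't need separate entries for each casing. The marker names (`<cap>`, `<up>`), the `mark_case` helper, and the regex-based word splitting are all assumptions made up for this illustration, not the exact scheme from the paper.

```python
import re

# Hypothetical marker names chosen for this sketch; not taken from the paper.
TOK_CAP = "<cap>"  # the next word started with a capital letter
TOK_UP = "<up>"    # the next word was written in ALL CAPS


def mark_case(text: str) -> list[str]:
    """Lowercase all tokens, adding a marker before capitalised or all-caps words."""
    out = []
    # Naive split into runs of word characters and single punctuation marks.
    for tok in re.findall(r"\w+|[^\w\s]", text):
        if tok.isupper() and len(tok) > 1:
            out += [TOK_UP, tok.lower()]
        elif tok[0].isupper():
            out += [TOK_CAP, tok.lower()]
        else:
            # Mixed case such as "iPhone" is simply lowercased in this sketch.
            out.append(tok.lower())
    return out


print(mark_case("Hello WORLD, it works"))
```

The model then sees one lowercase form per word plus a small number of markers, and the original casing can mostly be restored by reversing the rule at decode time.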

jph00 • Feb 04 '23 20:02