pranjicm
I just found https://github.com/huggingface/pytorch-openai-transformer-lm/issues/12 and https://github.com/openai/finetune-transformer-lm/issues/9. Not sure how I missed the first issue; I had looked through the open issues. Still, it shows that comments in this code...
Ok, I'll create the pull request.
Tokens from 0 to `n_vocab` are tokens from the vocabulary (data), from `n_vocab` to `n_vocab+n_special` are special tokens: `_start_`, `_delimiter_` and `_classify_`. All above, up to `n_vocab + n_special +...
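The index layout described above can be sketched as follows. This is a minimal illustration, not code from the repo; the concrete values `n_vocab=40478`, `n_special=3`, and `n_ctx=512` are assumptions based on the defaults in `finetune-transformer-lm`, and the assumption that the remaining ids are used for position embeddings follows the upstream code.

```python
# Assumed defaults from finetune-transformer-lm (not taken from this thread).
n_vocab = 40478   # BPE vocabulary size
n_special = 3     # _start_, _delimiter_, _classify_
n_ctx = 512       # maximum context length

# Ids [0, n_vocab) are regular BPE vocabulary tokens.
# The special tokens come immediately after the vocabulary:
start_token = n_vocab            # _start_
delimiter_token = n_vocab + 1    # _delimiter_
classify_token = n_vocab + 2     # _classify_

# Presumably, ids [n_vocab + n_special, n_vocab + n_special + n_ctx)
# are reserved for the learned position embeddings:
first_position_id = n_vocab + n_special

print(start_token, delimiter_token, classify_token, first_position_id)
```

So the single embedding matrix has `n_vocab + n_special + n_ctx` rows, and each input token is paired with its position id so both embeddings can be looked up in the same matrix and summed.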