Results 1 issues of P G

What is the purpose of ignore_index=-1 in loss calculation? I understand it's usually applied to exclude special tokens like padding, sequence end, etc. But nanoGPT does not seem to use...