Brad Skaggs
I'm sorry, I haven't touched this in years. I *think* you have to run the processing pipeline to populate some files that are missing: https://github.com/karpathy/arxiv-sanity-preserver/blob/1682975ccca50f1a0582c806aacd07266f689061/README.md#processing-pipeline
I'm in a similar situation in an application I'm considering. As a hack, it may help to just shuffle the lines in your training file to help break the implied...
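A minimal sketch of the line-shuffling hack, assuming the training file is plain text with one example per line and small enough to hold in memory. The name `shuffle_lines` and the `seed` parameter are illustrative and do not come from the original comment.

```python
import random


def shuffle_lines(lines, seed=None):
    """Return a shuffled copy of `lines`, leaving the input untouched.

    `seed` is a hypothetical convenience parameter that makes the
    shuffle reproducible across runs.
    """
    rng = random.Random(seed)
    shuffled = list(lines)  # copy so the caller's list keeps its order
    rng.shuffle(shuffled)   # in-place Fisher-Yates shuffle on the copy
    return shuffled


if __name__ == "__main__":
    # Stand-in for lines read from a training file.
    example = ["first\n", "second\n", "third\n", "fourth\n"]
    print("".join(shuffle_lines(example, seed=0)), end="")
```

For a file too large for memory, an external tool or a chunked shuffle would be needed instead; this sketch does not cover that case.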
Could you have a mostly-ones mask that is the size of your input for each minibatch? Let it be zero where you want the sequences to zero-out the input state,...
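One way to picture the mask idea is the sketch below. It assumes the recurrent state for a minibatch is a list of per-example vectors and that the mask holds one value per example for the current step: 1.0 to carry the state forward, 0.0 to reset it. The name `apply_reset_mask` and these shapes are assumptions for illustration, not the API of any particular framework.

```python
def apply_reset_mask(state, mask):
    """Multiply each example's state vector by its mask value.

    state: list of per-example state vectors (lists of floats).
    mask:  list with one float per example; 1.0 keeps the state,
           0.0 zeroes it so the next sequence starts fresh.
    """
    if len(state) != len(mask):
        raise ValueError("state and mask must have one entry per example")
    return [[h * m for h in row] for row, m in zip(state, mask)]


if __name__ == "__main__":
    state = [[0.5, -1.0], [2.0, 3.0], [1.5, 0.25]]
    mask = [1.0, 0.0, 1.0]  # reset only the second example
    print(apply_reset_mask(state, mask))
```

In an array library the same step is typically a single broadcasted multiply, which lets one batch mix examples that continue a sequence with examples that start a new one.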