It looks like Windows is deleting files that contain JS shellcode exploits, causing the load to fail.
It sounds like you are looking for something like I am currently also trying to do something similar, but have yet to even get data gathering working right, let alone...
Isn't that literally what Num_samples in is for? It runs off x number of possible sequences. Or do you mean stopping partway through and then giving options for branching...
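To make the "x number of possible sequences" idea concrete, here is a rough sketch of drawing several independent continuations from one prompt. The toy lookup table `TOY_MODEL` and the function `sample_sequences` are made up for illustration and are not taken from any real project's code; a real model would sample from predicted token probabilities instead.

```python
import random

# Hypothetical toy "model": for each token, the tokens that may follow it.
TOY_MODEL = {
    "to": ["be", "sleep"],
    "be": ["or", "not"],
    "or": ["not", "to"],
    "not": ["to", "be"],
    "sleep": ["to", "or"],
}


def sample_sequences(prompt, num_samples, max_new_tokens, seed=None):
    """Return num_samples independently sampled continuations of prompt."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    samples = []
    for _ in range(num_samples):
        tokens = list(prompt)  # every sample restarts from the same prompt
        for _ in range(max_new_tokens):
            # Pick the next token at random from the allowed followers.
            tokens.append(rng.choice(TOY_MODEL[tokens[-1]]))
        samples.append(tokens)
    return samples


if __name__ == "__main__":
    for seq in sample_sequences(["to"], num_samples=3, max_new_tokens=5, seed=0):
        print(" ".join(seq))
```

Each sample runs to the end on its own here. Stopping partway and offering branch choices would need a different loop that pauses after each token.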
I would try modifying the clone of Alpaca to NanoGPT and try to get it to InstructGPT levels, like it seems to be at. That being said, you might...
Well, it's the TinyShakespeare dataset, which is labeled as 40000 lines of Shakespeare, so yes, on purpose.