Dmitry Kobak
> So my CL basically screws over all the people training Shakespeare. :/ Sorry about that. [...] So maybe we need a different config param setup for the shakespeare people....
I don't know how you could get to validation loss 1.22 on shakespeare_char -- Andrej got 1.47 (see README) and many people reported the same in various issues. But with...
It doesn't matter for the output, but if you really managed to push the validation loss down to 1.22 then I would be SUPER curious to know which parameters you...
I could not get validation loss below 1.45 by changing any of the basic model size parameters + dropout strength:
```
n_layer = 6
n_head = 6
block_size = 256
...
```
Do you remember which parameters you were tweaking? 🙏
Thanks. I now see that you used full Shakespeare text from Project Gutenberg for this experiment, so this may explain the difference in the validation loss. Your version has ~3x...
@karpathy Project Gutenberg seems to have the complete Shakespeare (plays + sonnets + poems) in one TXT file, available here: https://www.gutenberg.org/cache/epub/100/pg100.txt It has 182k lines. Removing the publishing notes in the...
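In case it helps, here is a rough sketch of how the publishing notes could be stripped automatically. The `*** START/END OF THE PROJECT GUTENBERG EBOOK` markers are my assumption about the file layout and should be checked against the actual pg100.txt:

```python
def strip_gutenberg_notes(text: str) -> str:
    """Return only the body between the standard Gutenberg START/END markers."""
    start_marker = "*** START OF THE PROJECT GUTENBERG EBOOK"
    end_marker = "*** END OF THE PROJECT GUTENBERG EBOOK"
    start = text.find(start_marker)
    if start != -1:
        # Skip past the rest of the marker line itself
        start = text.index("\n", start) + 1
    else:
        start = 0
    end = text.find(end_marker)
    if end == -1:
        end = len(text)
    return text[start:end].strip()


# Tiny synthetic example standing in for the real 182k-line file:
sample = (
    "publishing notes...\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK THE COMPLETE WORKS ***\n"
    "THE SONNETS\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK THE COMPLETE WORKS ***\n"
    "license text..."
)
print(strip_gutenberg_notes(sample))  # -> THE SONNETS
```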
Just as a sanity check, does the unpickling fail if you run `pickle.load(f)` in a separate Python session from `pickle.dump(embedding, f)`? So first do pickle dump, then exit Python, then...
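For concreteness, a minimal version of that check might look like this, with a dummy object standing in for the real embedding (the actual repro would pickle the openTSNE embedding instead):

```python
import os
import pickle
import subprocess
import sys
import tempfile

# Stand-in for the real embedding object
embedding = {"coords": [[0.1, 0.2], [0.3, 0.4]]}

path = os.path.join(tempfile.mkdtemp(), "embedding.pickle")
with open(path, "wb") as f:
    pickle.dump(embedding, f)

# Load in a *fresh* interpreter, mimicking "dump, exit Python, then load":
loader = f"import pickle; obj = pickle.load(open({path!r}, 'rb')); print(obj['coords'])"
result = subprocess.run([sys.executable, "-c", loader],
                        capture_output=True, text=True)
print(result.stdout.strip())  # the coords survive the round trip
```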
Just to clarify: is this the ONLY test that fails? I don't think the rest relies on Annoy: it uses the Iris data which is so small that openTSNE should...
> Or could it be that scipy computes distances a bit differently to scikit-learn on i386? No idea, but it does seem like the problem may be upstream. Would be...