Dmitry Kobak
> So my CL basically screws over all the people training Shakespeare. :/ Sorry about that. [...] So maybe we need a different config param setup for the shakespeare people....
I don't know how you could get to validation loss 1.22 on shakespeare_char -- Andrej got 1.47 (see README) and many people reported the same in various issues. But with...
It doesn't matter for the output, but if you really managed to push the validation loss down to 1.22 then I would be SUPER curious to know which parameters you...
I could not get validation loss below 1.45 by changing any of the basic model size parameters + dropout strength:
```
n_layer = 6
n_head = 6
block_size = 256
...
```
Do you remember which parameters you were tweaking? 🙏
Thanks. I now see that you used full Shakespeare text from Project Gutenberg for this experiment, so this may explain the difference in the validation loss. Your version has ~3x...
@karpathy Project Gutenberg seems to have the complete Shakespeare (plays + sonnets + poems) in one TXT file, available here: https://www.gutenberg.org/cache/epub/100/pg100.txt It has 182k lines. Removing the publishing notes in the...
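In case it helps, here is a rough sketch of how the publishing notes could be stripped automatically. The `*** START/END OF THE PROJECT GUTENBERG EBOOK` markers are my assumption about the file layout and should be checked against the actual pg100.txt:

```python
def strip_gutenberg_notes(text: str) -> str:
    """Return only the body between the standard Gutenberg START/END markers."""
    start_marker = "*** START OF THE PROJECT GUTENBERG EBOOK"
    end_marker = "*** END OF THE PROJECT GUTENBERG EBOOK"
    start = text.find(start_marker)
    if start != -1:
        # Skip past the rest of the marker line itself
        start = text.index("\n", start) + 1
    else:
        start = 0
    end = text.find(end_marker)
    if end == -1:
        end = len(text)
    return text[start:end].strip()


# Tiny synthetic example standing in for the real 182k-line file:
sample = (
    "publishing notes...\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK THE COMPLETE WORKS ***\n"
    "THE SONNETS\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK THE COMPLETE WORKS ***\n"
    "license text..."
)
print(strip_gutenberg_notes(sample))  # -> THE SONNETS
```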
Just as a sanity check, does the unpickling fail if you run `pickle.load(f)` in a separate Python session from `pickle.dump(embedding, f)`? So first do pickle dump, then exit Python, then...
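For concreteness, a minimal version of that check might look like this, with a dummy object standing in for the real embedding (the actual repro would pickle the openTSNE embedding instead):

```python
import os
import pickle
import subprocess
import sys
import tempfile

# Stand-in for the real embedding object
embedding = {"coords": [[0.1, 0.2], [0.3, 0.4]]}

path = os.path.join(tempfile.mkdtemp(), "embedding.pickle")
with open(path, "wb") as f:
    pickle.dump(embedding, f)

# Load in a *fresh* interpreter, mimicking "dump, exit Python, then load":
loader = f"import pickle; obj = pickle.load(open({path!r}, 'rb')); print(obj['coords'])"
result = subprocess.run([sys.executable, "-c", loader],
                        capture_output=True, text=True)
print(result.stdout.strip())  # the coords survive the round trip
```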
Just to clarify: is this the ONLY test that fails? I don't think the rest relies on Annoy: it uses the Iris data which is so small that openTSNE should...
> Or could it be that scipy computes distances a bit differently to scikit-learn on i386? No idea, but it does seem like the problem may be upstream. Would be...