makemore
An autoregressive character-level language model for making more things
Hi! Thanks for this little piece of juicy code! Just out of curiosity, I've noticed that in your implementation you are using `nn.LayerNorm` with the standard denominator constant `eps=1e-5`, whereas in...
To those in the know: are there any newer alternatives that work better and do the same thing as this? I fear I'm missing out on something more effective....
Updated the necessary package installation instructions before running the file: `$ pip install torch numpy tensorboard`
If we had labels for these names, such as:

```
| name | is_palindrome | h_index | scrabble_score |
|--------+---------------+---------+----------------|
| anna | 1 | 4 | 4 |
|...
```
Small code simplification: the line

```
chars = sorted(list(set(''.join(words))))
```

can be simplified to

```
chars = sorted(set(''.join(words)))
```

because `sorted(...)` accepts any `iterable` and `set(...)` returns an `iterable`. I...
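A quick way to check the claim is to run both forms side by side. This is only a minimal sketch: the `words` list is a hypothetical stand-in for the names dataset, not data from the repository.

```python
# Hypothetical sample data standing in for the names dataset.
words = ["emma", "olivia", "ava"]

# Original form: the set is copied into a list before sorting.
chars_verbose = sorted(list(set(''.join(words))))

# Simplified form: sorted() consumes the set directly and returns a new list.
chars_simple = sorted(set(''.join(words)))

print(chars_simple)
print(chars_verbose == chars_simple)
```

Both expressions build the same sorted list of unique characters; the simplified one just skips an intermediate list.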
I wanted to train the program to make more Swedish names. They contain special characters like Å and Ö, so I need to read the file using UTF-8. On Windows...
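For illustration, here is a minimal sketch of reading a names file with an explicit encoding. The file name, the temporary directory, and the sample names are all hypothetical. When `encoding` is omitted, Python's `open()` typically falls back to the platform's locale encoding, which on Windows is often not UTF-8.

```python
import os
import tempfile

# Hypothetical sample names containing non-ASCII characters.
names = ["Åsa", "Örjan", "Märta"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "names.txt")  # hypothetical file name

    # Write the file as UTF-8 so the round trip below is well defined.
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(names))

    # Pass the encoding explicitly instead of relying on the platform default.
    with open(path, "r", encoding="utf-8") as f:
        words = f.read().splitlines()

# Build the character vocabulary the same way as for ASCII names.
chars = sorted(set("".join(words)))
print(words)
print(chars)
```

Passing `encoding="utf-8"` on every `open()` call should make the behavior the same across operating systems.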
[Here](https://github.com/karpathy/makemore/blob/988aa59e4d8fefa526d06f3b453ad116258398d4/makemore.py#L382) you are padding the tensor with a special start token. It looks strange to me that you are doing it inside the embedding. Aren't you supposed to...
Hi, thanks for your videos. I just finished watching the [first part](https://www.youtube.com/watch?v=PaCmpygFfXo). When I tried to intersect the test and train datasets, I noticed that some names repeat in the...
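One way to run this kind of check is with set intersection and a counter. This is a minimal sketch: `train_words` and `test_words` are hypothetical lists, not the repository's actual splits.

```python
from collections import Counter

# Hypothetical splits; in practice these would come from the dataset.
train_words = ["emma", "olivia", "ava", "emma"]
test_words = ["ava", "mia"]

# Names that appear in both splits.
overlap = set(train_words) & set(test_words)

# Names that appear more than once within the training split.
train_dupes = sorted(w for w, c in Counter(train_words).items() if c > 1)

print(sorted(overlap))
print(train_dupes)
```

If the overlap matters for evaluation, one option is to deduplicate the word list before splitting, though whether that is desirable depends on what the test loss is meant to measure.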
I tried not to. I had to. It's the small things, right? Favoring verbosity, you could say "This is not meant to be too heavyweight a library" but that is...
Hi @karpathy, thanks for that great repo! Maybe it would be better to note in your code that while you're training by [minimizing the CE loss](https://github.com/karpathy/makemore/blob/f61811b994280cb12ddae15ef5800baa2e3a1ca4/makemore.py#L392), Bengio actually **maximized** the...
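As general background on the minimize-versus-maximize distinction, the sketch below shows that, with one-hot targets, cross-entropy is the negative of the average log-likelihood, so minimizing one is the same as maximizing the other. The predicted distributions and targets are made up for illustration, and both helper functions are hypothetical; this does not reproduce the code in `makemore.py`.

```python
import math

# Hypothetical predicted distributions over a 3-character vocabulary, one per position.
preds = [
    [0.5, 0.3, 0.2],
    [0.1, 0.8, 0.1],
    [0.25, 0.25, 0.5],
]
# Hypothetical indices of the correct next character at each position.
targets = [0, 1, 2]

def cross_entropy(preds, targets):
    """Mean cross-entropy between one-hot targets and predicted distributions."""
    total = 0.0
    for dist, t in zip(preds, targets):
        one_hot = [1.0 if i == t else 0.0 for i in range(len(dist))]
        total += -sum(y * math.log(p) for y, p in zip(one_hot, dist))
    return total / len(targets)

def mean_log_likelihood(preds, targets):
    """Average log-probability assigned to the correct characters."""
    return sum(math.log(dist[t]) for dist, t in zip(preds, targets)) / len(targets)

ce = cross_entropy(preds, targets)
ll = mean_log_likelihood(preds, targets)
print(ce, ll)
```

The two printed values differ only in sign, which is why the choice between "minimize the loss" and "maximize the likelihood" is a matter of convention and does not change the objective.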