Andrej
Your suggestion isn't quite right either though, e.g. requirements.txt is left outside, and so is the configurator, etc. And we should probably hide away the tokenizer too.
There is no TinyStories Chat dataset 😢
@xefoci7612 fun idea! This would def work as a finetuning dataset to create a TinyStoriesChat. Of course, the chat model would be constrained to the "universe" of tiny stories, which...
Does anyone understand how -O3 can possibly be causing this issue?
I can't repro this issue on my computer. Would it maybe work to change the check to: ``` if (model->mean_loss < 0) { printf("Error: must forward with targets before backward\n");...
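For illustration, here is a minimal, self-contained sketch of that kind of guard. It assumes a negative `mean_loss` is a sentinel meaning "no forward pass with targets has run yet"; `ToyModel`, `toy_init`, `toy_forward`, and `toy_can_backward` are hypothetical names made up for this example, not the project's actual API. The range test `< 0` does not rely on exact floating-point equality with the sentinel value. Whether it resolves the reported issue is not established here.

```c
#include <stdio.h>

// Hypothetical stand-in for the real model struct (assumption):
// only the one field the check needs.
typedef struct {
    float mean_loss; // negative sentinel: no forward pass with targets yet
} ToyModel;

// Mark the model as "no loss computed yet".
void toy_init(ToyModel *model) {
    model->mean_loss = -1.0f;
}

// Pretend forward pass with targets: just records a non-negative loss.
void toy_forward(ToyModel *model, float loss) {
    model->mean_loss = loss;
}

// Returns 1 if backward may proceed, 0 (after printing an error) otherwise.
int toy_can_backward(const ToyModel *model) {
    if (model->mean_loss < 0) {
        printf("Error: must forward with targets before backward\n");
        return 0;
    }
    return 1;
}
```

Calling `toy_can_backward` right after `toy_init` prints the error and returns 0; after `toy_forward(&m, 2.5f)` it returns 1.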
I added a comment to the README for now. I don't have a good sense of when the code works or does not work, so it feels hard to change the...
What is the advantage?
ok for now i think
definitely! but this is pretty far down the line, i think we first need to get the 1-GPU version to be super solid.
Sounds great! I expect to get started with the backward pass somewhere over the weekend most likely. (I spent today optimizing the forward pass still) Once we have the backward...