llama2.c
Simple text
These changes add support for training on tinyshakespeare (adapted from llama2.py), as well as on simple blank-line-separated text.
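As a rough illustration of the "blank-line-separated text" format described above, here is a minimal sketch that writes and re-reads such a file. The exact delimiting behavior is an assumption based on the description; the file name `tinytext.txt` is taken from the question below.

```python
# Sketch: build a tiny blank-line-separated text file of the kind this
# change is meant to support (format assumed from the description above).
samples = [
    "To be, or not to be, that is the question.",
    "All the world's a stage, and all the men and women merely players.",
    "The quality of mercy is not strained.",
]

# One sample per paragraph, paragraphs separated by a blank line.
with open("tinytext.txt", "w") as f:
    f.write("\n\n".join(samples) + "\n")

# Reading it back splits cleanly on blank lines.
with open("tinytext.txt") as f:
    chunks = [c.strip() for c in f.read().split("\n\n") if c.strip()]
print(len(chunks))  # 3
```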
Hello! Excuse me, I wrote a tinytext.txt of about a few dozen lines. When I ran

python tinyshakespeare.py pretokenize
python train.py --dataset=tinyshakespeare

the following error occurred:
assert num_batches > 0, "this split is way too small? investigate."
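For context, that assertion most likely fires because the pretokenized split holds fewer tokens than one full sequence. A rough sketch of the batch-count arithmetic (mirroring the logic in llama2.c's dataset loader; the exact token counts here are assumptions for illustration):

```python
# Rough sketch of why the assertion fires: the loader slices the token
# stream into max_seq_len-sized chunks and drops the final partial one.
max_seq_len = 256  # default context length in llama2.c's train.py

def num_batches(token_count: int) -> int:
    # integer division, minus one for the dropped partial chunk
    return token_count // max_seq_len - 1

# A few dozen lines of text tokenizes to only a few hundred tokens,
# so the split yields no full batch and the assert trips:
print(num_batches(300))     # 0 -> "this split is way too small? investigate."
# A larger corpus clears the threshold:
print(num_batches(10_000))  # 38
```

So the fix is usually just more data: the dataset needs enough text that every split (train and val) tokenizes to well over `max_seq_len` tokens.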
I have only just started working with LLMs. llama2.c lets me run a model on my own computer, but I don't have enough background knowledge to quickly get started training a model of my own.
Could you please provide me with an example of a suitable tinytext.txt file? Thank you very much!