TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Sliding window attention is one of the tricks that makes Mistral 7B so good at prompt following. Can you add sliding window attention to TinyLlama's architecture?
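For readers unfamiliar with the technique the question refers to, here is a minimal sketch of how a sliding-window causal mask differs from the full causal mask a standard Llama-style model uses. The function name and the `window` parameter are illustrative assumptions, not TinyLlama's or Mistral's actual attention code.

```python
# Illustrative sketch only: build a sliding-window causal attention mask.
import torch

def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    """True where query position i may attend to key position j:
    causal (j <= i) and within the window (j > i - window)."""
    idx = torch.arange(seq_len)
    return (idx[None, :] <= idx[:, None]) & (idx[None, :] > idx[:, None] - window)

# Example: with seq_len=8 and window=4, each token attends to itself
# and at most the 3 tokens before it.
print(sliding_window_causal_mask(8, 4).int())
```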
Do you plan on doing a writeup of your findings? It would appear that you reached saturation sometime between 2.5T and 3T tokens, but doesn't that blow away the typical...
When using GGUF and llama.cpp, is there a specific vocab file I should use, or can I use "ggml-vocab-llama.gguf"? The number of kv groups is different in TinyLlama so I...
Hey, sorry if I'm just too excited to see the final checkpoint of TinyLlama, but is the 3T checkpoint ready? The timeline on the README indicates it was supposed to...
After fine-tuning the model, I obtained a 2.2 GB PyTorch model.bin file. Is it possible to reduce this model size to 550 MB, and if so, how and when can...
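For context on the sizes in the question above: a 1.1B-parameter model in fp16 is roughly 2.2 GB (2 bytes per weight), and 4-bit quantization brings the weights down to roughly 550 MB (0.5 bytes per weight). Below is a rough sketch, assuming the fine-tuned checkpoint is a standard Hugging Face model and that `bitsandbytes` and `accelerate` are installed; the path is a placeholder, not an actual project file.

```python
# Hedged sketch: load a fine-tuned checkpoint with 4-bit weights via bitsandbytes.
# "path/to/finetuned-tinyllama" is a placeholder for illustration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # ~1.1B params * 0.5 byte ≈ 550 MB of weights
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 while weights stay 4-bit
)

model = AutoModelForCausalLM.from_pretrained(
    "path/to/finetuned-tinyllama",
    quantization_config=bnb_config,
    device_map="auto",                     # requires the accelerate package
)
```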
If we merge adapters, does this work for continual learning? I expect the newly trained model to still produce correct output for previously trained data as well.
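On the mechanics of merging: a short sketch, assuming the adapters were trained with PEFT/LoRA; both paths are placeholders, not the project's actual checkpoints. Merging only folds the adapter deltas into the base weights; whether earlier behavior is retained depends on the fine-tuning data, not on the merge step itself.

```python
# Hedged sketch of merging a LoRA adapter into its base model with PEFT.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("path/to/base-tinyllama")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

merged = model.merge_and_unload()          # folds LoRA deltas into the base weights
merged.save_pretrained("path/to/merged-model")
```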
Your chat model looks great! How did you choose the datasets when finetuning on the OpenAssistant repo? Also, does your chat model finetuning include only SFT, or does it also include reward-model and RL training?
@jzhang38 I saw the Chat Demo wasn't working, so I made a Chat WebUI for the model, like we discussed in the old pull request. Its UI is similar to [openchat](https://openchat.team),...
so that the pretrain command does not work anymore!
Who can help me?