TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Sliding window attention is one of the tricks that makes Mistral 7B so good at prompt following. Can you add sliding window attention to TinyLlama's architecture?
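For readers unfamiliar with the technique the question refers to, here is a minimal sketch of how a sliding-window causal mask differs from the full causal mask a standard Llama-style model uses. The function name and the `window` parameter are illustrative assumptions, not TinyLlama's or Mistral's actual attention code.

```python
# Illustrative sketch only: build a sliding-window causal attention mask.
import torch

def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    """True where query position i may attend to key position j:
    causal (j <= i) and within the window (j > i - window)."""
    idx = torch.arange(seq_len)
    return (idx[None, :] <= idx[:, None]) & (idx[None, :] > idx[:, None] - window)

# Example: with seq_len=8 and window=4, each token attends to itself
# and at most the 3 tokens before it.
print(sliding_window_causal_mask(8, 4).int())
```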
Do you plan on doing a writeup of your findings? It would appear that you reached saturation sometime between 2.5T and 3T tokens, but doesn't that blow away the typical...
When using GGUF and llama.cpp, is there a specific vocab file I should use, or can I use "ggml-vocab-llama.gguf"? The number of kv groups is different in TinyLlama so I...
Hey, sorry if I'm just too excited to see the final checkpoint of TinyLlama, but is the 3T checkpoint ready? The timeline on the README indicates it was supposed to...
After fine-tuning the model, I obtained a 2.2 GB PyTorch model.bin file. Is it possible to reduce this model size to 550 MB, and if so, how and when can...
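For context on the sizes in the question above: a 1.1B-parameter model in fp16 is roughly 2.2 GB (2 bytes per weight), and 4-bit quantization brings the weights down to roughly 550 MB (0.5 bytes per weight). Below is a rough sketch, assuming the fine-tuned checkpoint is a standard Hugging Face model and that `bitsandbytes` and `accelerate` are installed; the path is a placeholder, not an actual project file.

```python
# Hedged sketch: load a fine-tuned checkpoint with 4-bit weights via bitsandbytes.
# "path/to/finetuned-tinyllama" is a placeholder for illustration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # ~1.1B params * 0.5 byte ≈ 550 MB of weights
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 while weights stay 4-bit
)

model = AutoModelForCausalLM.from_pretrained(
    "path/to/finetuned-tinyllama",
    quantization_config=bnb_config,
    device_map="auto",                     # requires the accelerate package
)
```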
If we merge adapters, does this work for continual learning? I expect the newly trained model to still produce correct output for previously trained data as well.
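On the mechanics of merging: a short sketch, assuming the adapters were trained with PEFT/LoRA; both paths are placeholders, not the project's actual checkpoints. Merging only folds the adapter deltas into the base weights; whether earlier behavior is retained depends on the fine-tuning data, not on the merge step itself.

```python
# Hedged sketch of merging a LoRA adapter into its base model with PEFT.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("path/to/base-tinyllama")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

merged = model.merge_and_unload()          # folds LoRA deltas into the base weights
merged.save_pretrained("path/to/merged-model")
```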
Your chat model looks great! How did you choose the datasets when finetuning on the OpenAssistant repo? Also, does your chat model finetuning include only SFT, or does it also include reward-model and RL training?
@jzhang38 I saw the Chat Demo wasn't working, so I made a Chat WebUI for the model, like we discussed in the old pull request. Its UI is similar to [openchat](https://openchat.team),...
so that the pretrain command does not work anymore!
Who can help me?