TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Hi, may I ask a simple question: you claim 24K tokens/s with the 1.1B model, which is 56% efficiency. But my CUDA code with pure cuBLAS GEMM calls on a 2048*2048 matrix...
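For reference, a rough back-of-the-envelope utilization check (a sketch, not the project's own accounting; it assumes an A100 bf16 peak of 312 TFLOPS and the common 6*N FLOPs-per-token training approximation):

```python
# Rough model-FLOPs-utilization (MFU) estimate for the 24K tokens/s figure.
# Assumptions (not from the thread): A100 bf16 dense peak of 312 TFLOPS and
# the 6*N FLOPs-per-token approximation for training (weights only).

params = 1.1e9           # TinyLlama parameter count
tokens_per_sec = 24_000  # reported per-GPU training throughput
peak_flops = 312e12      # assumed A100 bf16 peak

flops_per_token = 6 * params                 # forward + backward, weight FLOPs only
achieved = tokens_per_sec * flops_per_token  # FLOPs/s actually sustained
print(f"MFU ~ {achieved / peak_flops:.0%}")  # ~51%; counting attention FLOPs pushes this higher
```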
Hello, is this LLaMA's tokenizer, or did you train one yourselves?
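One quick way to check locally (a minimal sketch; the hub id is an assumption about which checkpoint is meant) is to load the tokenizer and compare it against LLaMA's 32,000-token SentencePiece vocabulary:

```python
# Load the tokenizer from the hub and inspect its class and vocab size.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T")
print(tok.__class__.__name__, tok.vocab_size)  # a LLaMA-style tokenizer reports 32000
```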
Hi, is it possible to convert these weights into the https://github.com/facebookresearch/llama/tree/main/llama format?
How to train the model using TPUs?
Hi~ Great work! I notice the Chinese version of the README seems to be outdated, e.g., the HF space, news, and the missing intermediate checkpoint download link. So I tried to...
Hi, I found the following strange phenomena when running TinyLlama pretraining. 1. When using multiple GPUs, I got **completely different results** when **running the same code twice**. Further, many...
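For anyone hitting the same thing, below is a minimal sketch of the PyTorch knobs that usually matter for run-to-run reproducibility; whether they fully remove the discrepancy in this pretraining loop is an assumption:

```python
# Seed every RNG and disable non-deterministic kernel selection before training.
import random
import numpy as np
import torch

def seed_everything(seed: int = 1234) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

seed_everything()
torch.backends.cudnn.benchmark = False                    # no autotuned, run-dependent kernels
torch.use_deterministic_algorithms(True, warn_only=True)  # warn on ops without deterministic impls
# Note: multi-GPU all-reduce ordering and some fused kernels can still introduce
# small floating-point differences between runs.
```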
I tried to load the model with transformers, `small_model = AutoModelForCausalLM.from_pretrained(approx_model_name, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True)`, but an error occurs: `OSError: Unable to load weights from pytorch checkpoint file for '/mnt/data3/lyk/models/tinyllama-1.1b/pytorch_model.bin' at...
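One way to narrow that OSError down (a sketch using the path from the report above) is to call torch.load on the checkpoint directly, which surfaces the underlying cause, e.g. a truncated download or a torch version mismatch:

```python
# Try to deserialize the checkpoint on its own to see the real error message.
import torch

path = "/mnt/data3/lyk/models/tinyllama-1.1b/pytorch_model.bin"
try:
    state_dict = torch.load(path, map_location="cpu")
    print(f"loaded {len(state_dict)} tensors")
except Exception as exc:  # print whatever torch raises instead of the wrapped OSError
    print(type(exc).__name__, exc)
```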
You are welcome to publish the model and code files to the wisemodel.cn open-source community.
I used `TinyLlama-1.1B-intermediate-step-1431k-3T` for conversation under the FastChat framework. I asked the question "What's your name?" and the answer I got is:
```bash
python -m fastchat.serve.cli --model-path $my_path_to_tiny_llama/tiny_llama/TinyLlama-1.1B-intermediate-step-1431k-3T/
: What's your...
```
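Note that the intermediate-step-1431k-3T checkpoint is a base (not chat-tuned) model, so free-form questions often get odd answers. A minimal transformers sketch to sanity-check generation outside FastChat, keeping the placeholder path from the report:

```python
# Greedy generation directly with transformers, bypassing FastChat's chat wrapper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path copied from the report above; replace with the real local path.
path = "$my_path_to_tiny_llama/tiny_llama/TinyLlama-1.1B-intermediate-step-1431k-3T"
tok = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.float16, device_map="auto")

inputs = tok("What's your name?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```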