Zhang Peiyuan


We sample [100 iters](https://github.com/jzhang38/TinyLlama/blob/c53075b679c9a97f96562052689e0043120a5fd5/pretrain/tinyllama.py#L38) of val data for the actual validation. I believe this is because a different partition of the val data gets sampled after the...
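
For context, a minimal sketch of what a fixed-iteration validation loop like that looks like; `validate` and `val_dataloader` are illustrative names, not the actual TinyLlama code:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def validate(model, val_dataloader, eval_iters: int = 100):
    # Only the first `eval_iters` batches are scored, so the reported loss
    # shifts whenever a different partition of the val set gets sampled.
    model.eval()
    losses = []
    for i, (input_ids, targets) in enumerate(val_dataloader):
        if i >= eval_iters:
            break
        logits = model(input_ids)  # assumes the model returns raw logits
        losses.append(F.cross_entropy(logits.view(-1, logits.size(-1)),
                                      targets.view(-1)).item())
    model.train()
    return sum(losses) / len(losses)
```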

It actually takes up around 600MB on disk and around 700MB during inference, with activations taken into account (https://huggingface.co/TinyLlama/TinyLlama-1.1B-python-v0.1/blob/main/ggml-model-q4_0.gguf). I will update the readme.
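
For the curious, the on-disk number roughly checks out from q4_0's block layout alone. A back-of-the-envelope estimate (real GGUF files mix in other quant types for some tensors, so treat this as approximate):

```python
# ggml's q4_0 packs weights in blocks of 32: one fp16 scale (2 bytes)
# plus 32 four-bit values (16 bytes) = 18 bytes per block,
# i.e. 4.5 bits per weight.
n_params = 1.1e9  # TinyLlama-1.1B
disk_bytes = n_params * 4.5 / 8
print(f"~{disk_bytes / 1e6:.0f} MB")  # ~619 MB, close to the observed ~600 MB
```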

@TapendraBaduwal I recommend checking out https://github.com/OpenAccess-AI-Collective/axolotl

I agree with @RonanKMcGovern here on the effectiveness of sliding window attention (even though I have not done an apples-to-apples comparison). Would appreciate it if someone could submit a PR...
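
For anyone unfamiliar with the idea, here is a minimal sketch of the masking pattern sliding window attention uses; the function name and shapes are mine for illustration, not from the TinyLlama codebase:

```python
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    # True where position i may attend to position j:
    # causal (j <= i) and within the window (j > i - window).
    i = torch.arange(seq_len).unsqueeze(1)
    j = torch.arange(seq_len).unsqueeze(0)
    return (j <= i) & (j > i - window)

mask = sliding_window_mask(seq_len=8, window=4)
scores = torch.randn(8, 8).masked_fill(~mask, float("-inf"))
attn = scores.softmax(dim=-1)  # each token only sees its last 4 positions
```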

Closing this issue for now.

> Maybe you can try the Mamba, RWKV, and StripedHyena architectures? If we have the compute.

Yes, we use the alignment handbook without changing any hyperparameters.

https://github.com/EleutherAI/lm-evaluation-harness is probably the repo you are looking for.
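
A hedged example of what a run can look like through the harness' Python API (assuming the v0.4-style `simple_evaluate` entry point; the model and task here are placeholders, swap in whatever you want to evaluate):

```python
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    tasks=["hellaswag"],
    num_fewshot=0,
)
print(results["results"])
```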

Follow the instructions at https://github.com/Dao-AILab/flash-attention to install FlashAttention.
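
If it helps, a small sketch of guarding on the install and falling back to PyTorch's built-in SDPA when `flash_attn` is missing; the wrapper itself is mine, only `flash_attn_func` comes from the library:

```python
try:
    from flash_attn import flash_attn_func  # needs a CUDA build of flash-attn
    HAS_FLASH_ATTN = True
except ImportError:
    HAS_FLASH_ATTN = False

import torch.nn.functional as F

def attention(q, k, v, causal: bool = True):
    # q, k, v: (batch, seqlen, nheads, headdim); fp16/bf16 on GPU for flash.
    if HAS_FLASH_ATTN:
        return flash_attn_func(q, k, v, causal=causal)
    # Fallback: PyTorch SDPA, which expects (batch, nheads, seqlen, headdim).
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2),
        is_causal=causal,
    )
    return out.transpose(1, 2)
```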