Suraj Subramanian
@jiaohuix - thanks for your work! I can help you write a blog about this for the wider community to learn from. Please email me at subramen-at-meta-dot-com if you're interested!
There's a fine-tuning script at https://github.com/facebookresearch/llama-recipes/blob/main/llama_finetuning.py which you could adapt for pretraining. Section 2 of the paper (https://arxiv.org/pdf/2307.09288.pdf) has the hyperparameters used for pretraining.
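For quick reference, here is a sketch of the Section 2 pretraining hyperparameters collected into a plain dict. The dict keys are illustrative only (they are not llama-recipes config fields), so you'd need to map them onto whatever config object the fine-tuning script exposes:

```python
# Pretraining hyperparameters reported in Section 2 of the Llama 2 paper.
# Key names are hypothetical -- map them onto the recipe's own train config.
llama2_pretraining_hparams = {
    "optimizer": "AdamW",
    "adam_beta1": 0.9,
    "adam_beta2": 0.95,
    "adam_eps": 1e-5,
    "lr_schedule": "cosine",
    "warmup_steps": 2000,
    "final_lr_fraction": 0.10,   # decay to 10% of the peak learning rate
    "peak_lr": 3e-4,             # 7B/13B; the 34B/70B models used 1.5e-4
    "weight_decay": 0.1,
    "grad_clip_norm": 1.0,
    "context_length": 4096,
}
```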
Hello @Vatsal1106Virani, take a look at https://github.com/facebookresearch/llama-recipes/tree/main/demo_apps which has many code samples to get you started.
The error seems to be occurring because the checkpoint directory path has spaces that haven't been escaped: `AssertionError: no checkpoint files found in D:\Coding\Environment\Llama`. Maybe try quoting the path, e.g. `--ckpt_dir "D:\Coding\Environment\Llama 2\llama-main\llama-2-13b"`, or...
The default MP sharding for llama-2-70b-chat is 8, so you shouldn't be facing this error. llama-2-13b has an MP of 2... is it possible that you are accidentally using that model instead?...
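One way to double-check which weights you actually downloaded is to count the `*.pth` shards in the checkpoint directory, since the loader expects one shard per model-parallel rank. A minimal sketch (the local path below is just a placeholder):

```python
from pathlib import Path

def count_mp_shards(ckpt_dir: str) -> int:
    """Count the .pth checkpoint shards in a directory.

    There is one shard per model-parallel rank, so llama-2-13b should
    show 2 shards and llama-2-70b-chat should show 8.
    """
    return len(sorted(Path(ckpt_dir).glob("*.pth")))

# Hypothetical local path -- point this at your own download.
print(count_mp_shards("llama-2-70b-chat"))  # expect 8; 2 means you grabbed the 13B weights
```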
Closing this issue. @bhargavanubavam, feel free to reopen when you have more information.
Serving frameworks like vLLM or TGI are mainly optimized for GPU usage, but they also have support for parallel inference and memory management that might be useful. Some examples [here](https://github.com/facebookresearch/llama-recipes/blob/main/demo_apps/llama-on-prem.md)
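As a minimal sketch of what serving with vLLM looks like (the model name and sampling settings below are just placeholders, not a recommendation):

```python
# Minimal vLLM sketch -- model name and sampling settings are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")  # any HF-format Llama 2 checkpoint
sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=256)

outputs = llm.generate(["Explain model parallelism in one paragraph."], sampling)
for out in outputs:
    print(out.outputs[0].text)
```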
Hi @dnatarajan00! Using `total_len = min(params.max_seq_len, max_gen_len + min_prompt_len)` appears to be semantically more correct, but note that it **reduces the effective `max_gen_len` for longer prompts**. In your example, the...
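To make the tradeoff concrete, here is a small worked example with made-up numbers (not the values from the original issue): once `total_len` is capped by the shortest prompt, any longer prompt in the batch has less than `max_gen_len` tokens of generation budget left.

```python
# Illustrative numbers only -- not the values from the original issue.
max_seq_len = 512
max_gen_len = 128
prompt_lens = [20, 100]            # a short and a long prompt in the same batch
min_prompt_len = min(prompt_lens)  # 20

# Proposed variant: cap the buffer at max_gen_len + min_prompt_len
total_len = min(max_seq_len, max_gen_len + min_prompt_len)  # min(512, 148) = 148

for p in prompt_lens:
    room = total_len - p
    print(f"prompt_len={p}: at most {room} new tokens (max_gen_len was {max_gen_len})")
# prompt_len=20:  at most 128 new tokens -- the full max_gen_len
# prompt_len=100: at most 48 new tokens -- the longer prompt loses generation budget
```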
Hi @sgsharma2000, firstly, thank you for submitting this (fairly large) PR! Reviewing your proposed changes will be much easier if you could split them across multiple smaller and narrowly-scoped...
@samuelselvan - can you take a look, please?