code-llama-for-vscode
Use Code Llama with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot.
I downloaded codellama-7B and set the Continue config.json like this: {"title": "LocalServer", "provider": "openai", "model": "codellama-7b-Instruct", "apiBase": "http://localhost:8000/v1/"}. Then I ran llamacpp_mock_api.py, and CodeLlama runs correctly on my computer...
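For reference, the snippet above corresponds to a Continue `models` entry roughly like the following. This is a sketch based on the excerpt: the title, model name, and `apiBase` come from the report, while nesting them under a `models` array follows Continue's usual config.json layout and is an assumption here.

```json
{
  "models": [
    {
      "title": "LocalServer",
      "provider": "openai",
      "model": "codellama-7b-Instruct",
      "apiBase": "http://localhost:8000/v1/"
    }
  ]
}
```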
Hey, the link for the edited config file doesn't work. Can you update it, or just upload a config file as an example? Thank you.
I saw that you mock `llama.cpp`, but I still have GPU resources, and I also have enough CPU & RAM. I just want to figure out whether this is the right setup for deploying it.
As the title says, this repository is missing an official `requirements.txt` to guide developers in installing dependencies. Will one be added later?
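Until an official file lands, a minimal starting point might look like the sketch below. This is an assumption, not the project's actual dependency list: it combines the packages Meta's codellama repository relies on (torch, fairscale, fire, sentencepiece) with Flask for the mock HTTP server; pin versions to whatever you test against.

```text
# Hypothetical requirements.txt sketch; not the official dependency list.
torch
fairscale
fire
sentencepiece
flask
```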
torchrun --nproc_per_node 1 llamacpp_mock_api.py \
    --ckpt_dir CodeLlama-7b-Instruct/ \
    --tokenizer_path CodeLlama-7b-Instruct/tokenizer.model \
    --max_seq_len 128 --max_batch_size 4
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing...
It returns `Error: special tags are not allowed as part of the prompt.` No settings were adjusted; this is a completely fresh instance of VS Code and the Continue extension.
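This message comes from Meta's `chat_completion`, which rejects dialogs containing the Llama instruction tags (`[INST]`, `[/INST]`, `<<SYS>>`, `<</SYS>>`). One workaround, sketched below under the assumption that the mock server builds a dialog list before calling `generator.chat_completion`, is to strip those tags from incoming content first; the `sanitize_dialog` helper name is hypothetical.

```python
# Hypothetical helper: strip Llama special tags from incoming messages
# before they reach generator.chat_completion(), which otherwise answers
# with "Error: special tags are not allowed as part of the prompt."
SPECIAL_TAGS = ["[INST]", "[/INST]", "<<SYS>>", "<</SYS>>"]

def sanitize_dialog(dialog):
    cleaned = []
    for msg in dialog:
        content = msg["content"]
        for tag in SPECIAL_TAGS:
            content = content.replace(tag, "")
        cleaned.append({"role": msg["role"], "content": content})
    return cleaned

# Assumed call site inside the mock API:
# results = generator.chat_completion([sanitize_dialog(dialog)], ...)
```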
I followed your instructions and managed to fulfill the prerequisites of downloading and running CodeLlama using Meta's repo. Trying to run the command you provided: ``` [my userpath]/codellama$ torchrun --nproc_per_node...
When running the 13B version? I added a function like:
```
def run_text_completion(prompts):
    generator.text_completion(...)
```
It loops before `self.generator` in the llama generator method, and using `generator.chat_completion` it will...
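For context, Meta's `Llama` class exposes `text_completion` alongside `chat_completion`. A minimal call, assuming a `generator` built with `Llama.build(...)` as in the repo's setup, looks roughly like this; the prompt and sampling values are illustrative.

```python
# Sketch of a raw text completion with Meta's codellama/llama API,
# assuming `generator` was created via Llama.build(...).
results = generator.text_completion(
    ["def fibonacci(n):"],   # list of prompt strings
    max_gen_len=64,          # illustrative values
    temperature=0.2,
    top_p=0.95,
)
print(results[0]["generation"])
```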
Hi, I thought about how to implement the streaming functionality and saw that the only way was to rewrite the generation functions in codellama, which seemed a bit messy. Simultaneously,...
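One lighter-weight alternative to rewriting the generation code, sketched below, is to generate the full completion first and then replay it to Continue as OpenAI-style server-sent-event chunks. Note this does not stream tokens as they are generated; it only delivers the finished text in the streaming wire format. The sketch assumes a Flask-based mock server and a `full_text` string already produced by the generator; names like `stream_chunks` are hypothetical.

```python
# Hypothetical sketch: fake streaming by chunking an already-generated
# completion into OpenAI-style SSE events. Assumes Flask and a finished
# `full_text` string from chat_completion()/text_completion().
import json
from flask import Flask, Response

app = Flask(__name__)

def stream_chunks(full_text, chunk_size=16):
    for i in range(0, len(full_text), chunk_size):
        delta = {"choices": [{"delta": {"content": full_text[i:i + chunk_size]}}]}
        yield f"data: {json.dumps(delta)}\n\n"
    yield "data: [DONE]\n\n"

@app.route("/v1/chat/completions", methods=["POST"])
def chat_completions():
    full_text = "..."  # placeholder: produced by the model in the real server
    return Response(stream_chunks(full_text), mimetype="text/event-stream")
```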