codellama

CodeLlama went into infinite cycle (of communication)

Open tambetvali opened this issue 1 year ago • 1 comments

I did not actually check whether it is infinite (I hit Ctrl+C before infinity), but it suddenly started to repeat itself:

tvali@PC366:~$ ollama run codellama:7b

Is LDM a Deep Learning model?

LDM (Learning-based Decision Making) is a type of decision making model that uses machine learning algorithms to learn from data and make decisions. It is based on the concept of learning and improving over time^C

Is Latent Diffusion Model a Deep Learning Model?

Latent Diffusion Model (LDM) is a type of deep learning model that uses a probabilistic approach to generate images. It is based on the concept of diffusion models, which are statistical models that describe how data points move over time. In the context of LDM, the diffusion process is used to model the generation of images, and the model learns to generate new images by iteratively refining its predictions.

LDM is a type of Generative Adversarial Network (GAN) that uses a probabilistic approach to generate images. It is based on the concept of learning and improving over time, using a process called iterative refinement. The model learns to generate new images by iteratively refining its predictions, and it can be used for various tasks such as image generation, image-to-image translation, and image synthesis.

LDM is a deep learning model that uses a probabilistic approach to generate images. It is based on the concept of diffusion models, which are statistical models that describe how data points move over time. In the context of LDM, the diffusion process is used to model the generation of images, and the model learns to generate new images by iteratively refining its predictions.

LDM is a type of Generative Adversarial Network (GAN) that uses a probabilistic approach to generate images. It is based on the concept of learning and improving over time, using a process called iterative refinement. The model learns to generate new images by iteratively refining its predictions, and it can be used for various tasks such as image generation, image-to-image translation, and image synthesis.

LDM is a deep learning model that uses a probabilistic approach to generate images. It is based on the concept of diffusion models, which are statistical models that describe how data points move over time. In the context of LDM, the diffusion process is used to model the generation of images, and the model learns to generate new images by iteratively refining its predictions.

LDM is a type of Generative Adversarial Network (GAN) that uses a probabilistic approach to generate images. It is based on the concept of learning and improving over time, using a process called iterative refinement. The model learns to generate new images by iteratively refining its predictions, and it can be used for various tasks such as image generation, image-to-image translation, and image synthesis.

LDM is a deep learning model that uses a probabilistic approach to generate images. It is based on the concept of diffusion models, which are statistical models that describe how data points move over time. In the context of LDM, the diffusion process is used to model the generation of images, and the model learns to generate new images by iteratively refining its predictions.

LDM is a type of Generative Adversarial Network (GAN) that uses a probabilistic approach to generate images. It is based on the concept of learning and improving over time, using a process called iterative refinement. The model learns to generate new images by iteratively refining its predictions, and it can be used for various tasks such as image generation, image-to-image translation, and image synthesis.

LDM is a deep learning model that uses a probabilistic approach to generate images. It is based on the concept of diffusion models, which are statistical models that describe how data points move over time. In the context of LDM, the diffusion process is used to model the generation of images, and the model learns to generate new images by iteratively refining its predictions.

LDM is a type of Generative Adversarial Network (GAN) that uses a probabilistic approach to generate images. It is based on the concept of learning and improving over time, using a process called iterative refinement. The model learns to generate new images by iteratively refining its predictions, and it can be used for various tasks such as image generation, image-to-image translation, and image synthesis.

LDM is a deep learning model that uses a probabilistic approach to generate images. It is based on the concept of diffusion models, which are statistical models that describe how data points move over time. In the context of LDM, the diffusion process is used to model the generation of images, and the model learns to generate new images by iteratively refining its predictions.

LDM is a type of Generative Adversarial Network (GAN) that uses a probabilistic approach to generate images. It is based on the concept of learning and improving over time, using a process called iterative refinement. The model learns to generate new images by iteratively refining its predictions, and it can be used for various tasks such as image generation, image-to-image translation, and image synthesis.

LDM is^C

Send a message (/? for help)
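One way to confirm the loop without reading the whole output is to count paragraphs that come back verbatim. The following is only a minimal sketch: `repeated_paragraphs` is a hypothetical helper, not part of ollama or the Code Llama repo, and the sample text is a shortened stand-in for the transcript above.

```python
from collections import Counter


def repeated_paragraphs(text: str, min_repeats: int = 2) -> dict[str, int]:
    """Return each paragraph that occurs at least `min_repeats` times in `text`."""
    # Paragraphs are separated by blank lines; inner whitespace is normalized
    # so that spacing differences do not hide a verbatim repeat.
    paragraphs = [" ".join(p.split()) for p in text.split("\n\n")]
    counts = Counter(p for p in paragraphs if p)
    return {p: n for p, n in counts.items() if n >= min_repeats}


# Shortened stand-in for the model output (assumption: the real paragraphs are longer).
sample = "\n\n".join([
    "LDM is a deep learning model.",
    "LDM is a type of GAN.",
    "LDM is a deep learning model.",
    "LDM is a type of GAN.",
    "LDM is",
])

print(repeated_paragraphs(sample))
```

A non-empty result means the output contains verbatim repeats; the interrupted last paragraph is counted once and therefore ignored.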

This is Ubuntu 22.04. Since I don't know what else you might need, I ran all of these commands:

/show info
Model details:
Family llama
Parameter Size 7B
Quantization Level Q4_0

/show modelfile
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this one, replace the FROM line with:
# FROM codellama:7b

FROM /usr/share/ollama/.ollama/models/blobs/sha256:3a43f93b78ec50f7c4e4dc8bd1cb3fff5a900e7d574c51a6f7495e48486e0dac
TEMPLATE """[INST] <<SYS>>{{ .System }}<</SYS>>

{{ .Prompt }} [/INST] """
PARAMETER rope_frequency_base 1e+06
PARAMETER stop "[INST]"
PARAMETER stop "[/INST]"
PARAMETER stop "<<SYS>>"
PARAMETER stop "<</SYS>>"

/show parameters
Model defined parameters:
stop "[INST]"
stop "[/INST]"
stop "<<SYS>>"
stop "<</SYS>>"
rope_frequency_base 1e+06

Send a message (/? for help)

Feb 23 '24 11:02 tambetvali

I can't provide support for the quantized versions of Code Llama. Can you double-check that you get similar behavior with inference code from this repo and a model retrieved via download.sh?

Feb 28 '24 07:02 jgehring

Sorry, I have only 5 GB of hard drive space left, I am without internet at home, and I would have to download things. Can't you simply ask the same question? I also don't know how to reproduce this with an AI: does it answer the same way every time? I noticed another bug as well: it does not always differentiate between "me" and "you".

Later, I want to go deeper into CodeLlama: I am learning some AI programming, and I have downloaded some videos about programming with CodeLlama and creating it from scratch as well. I was away from programming for some time, and now I notice how far AI has evolved. I have a lot to learn right now :)

Mar 01 '24 10:03 tambetvali

Can you check whether ollama's codellama:7b is built from the "Code Llama" or the "Code Llama Instruct" model? If it's the former, then it won't know out of the box when to stop. For use cases like question answering, you would need to either use few-shot prompting or use one of the Instruct models.
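As a rough illustration of few-shot prompting with a base model, the sketch below lays out a few question/answer pairs and leaves the last answer open for the model to continue. The `Q:`/`A:` layout, the example pairs, and both helper names are assumptions made for illustration, not something prescribed by Code Llama. Because a base model may keep going and invent further questions, the completion is cut at the first stop sequence.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], question: str) -> str:
    """Format Q/A pairs followed by an open question for a base model to continue."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


def truncate_at_stop(completion: str, stops: list[str]) -> str:
    """Cut the completion at the earliest stop sequence, if any occurs."""
    cut = len(completion)
    for stop in stops:
        index = completion.find(stop)
        if index != -1:
            cut = min(cut, index)
    return completion[:cut].strip()


# Hypothetical example pairs; any short, consistent Q/A pairs would do.
examples = [
    ("Is Python a programming language?", "Yes."),
    ("Is HTML a programming language?", "No, it is a markup language."),
]
prompt = build_few_shot_prompt(examples, "Is Latent Diffusion Model a Deep Learning Model?")
print(prompt)

# A made-up completion that runs on into a new question, as a base model might.
raw_completion = " Yes.\n\nQ: Is a GAN a Deep Learning Model?\nA: Yes."
print(truncate_at_stop(raw_completion, ["\nQ:"]))
```

The same stop sequence could instead be passed to the inference backend, so that generation ends there rather than being trimmed afterwards.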

Mar 04 '24 13:03 jgehring

How do I check that?

I followed these installation instructions:

1. Install, in VSC, the Continue plugin (it might not matter, but for information)
2. Install Ollama, probably as instructed on their site - https://ollama.com/download "install with one command"
3. VSC, after I chose Codellama:7b, instructed to run the command "ollama run codellama:7b"
4. I ran this command and it downloaded some kind of model, which started to work
5. This issue is from another run of this same command, "ollama run codellama:7b", which downloads the model if it's not there and runs it in any case.

So, following this, it has to be the model described here: https://ollama.com/library/codellama:7b

ID: 8fdf8f752f6e
Model family: llama
Parameters: 7B
Quantization: 4-bit
Last pushed: 4 months ago

It also has this information (https://ollama.com/library/codellama):

API

curl -X POST http://localhost:11434/api/generate -d '{ "model": "codellama", "prompt": "Write me a function that outputs the fibonacci sequence" }'

"model": "codellama" probably means that it's not an instruct model (I am repeating the words like a monkey, as I really don't know what the difference is, but it's probably as good as an AI answer).

Mar 04 '24 13:03 tambetvali

Thanks! Yeah, that seems to be the base model, which is trained purely to continue the sequence you prompt it with. For any Q&A-style or chat scenarios, try the 7b-instruct model.
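For reference, here is a minimal sketch of sending the same question to an instruct variant through the local endpoint from the curl example earlier in this thread. The tag `codellama:7b-instruct` is an assumption about how the library names that variant, and the default of 256 tokens is an arbitrary choice; `stream` and `options.num_predict` are request fields of Ollama's generate API, with `num_predict` used here to cap the output length so that a runaway answer cannot continue indefinitely.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this thread.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "codellama:7b-instruct",
                  max_tokens: int = 256) -> dict:
    """Build the JSON payload; the model tag and token cap are assumptions."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a token stream
        "options": {"num_predict": max_tokens},  # upper bound on generated tokens
    }


def generate(prompt: str, **kwargs) -> str:
    """Send the request to a locally running Ollama server and return the text."""
    body = json.dumps(build_request(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Printing the payload needs no server; generate() does.
payload = build_request("Is Latent Diffusion Model a Deep Learning Model?")
print(json.dumps(payload, indent=2))
```

With the server running and the model pulled, `generate("Is Latent Diffusion Model a Deep Learning Model?")` would return the answer as a single string.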

Mar 04 '24 14:03 jgehring

You can also check the examples in example_completion.py, for which the base models will output sensible completions: https://github.com/facebookresearch/codellama/blob/1af62e1f43db1fa5140fa43cb828465a603a48f3/example_completion.py#L27-L39

Mar 04 '24 14:03 jgehring

I'll close this for now; feel free to re-open if you're still having issues with this.

Mar 13 '24 22:03 jgehring