exllama
exllama copied to clipboard
Llama 2 Chat implementation
Hey!
This is the correct LLama 2 Chat prompt formatting implementation into example_llama2chat.py
.
This PR uses https://github.com/turboderp/exllama/pull/195 to copy the exact implementation of the original Llama repo
The format for Llama 2 looks like this:
<bos><<SYS>>
You are a helpful assistant. Sample text.
<</SYS>>
<eos><bos>[INST] What is Python? [/INST] Python is a programming language.<eos><bos>[INST] and so on
To run it yourself: python example_llama2chat.py -d ./your/model/path
Have fun!
I'm not having a lot of luck with this.
System: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information..
User: Hi!
Assistant: end do protect it this people! It open up so see p did dis put our they it P M P it P on PA M P back it it it M S P P P it P it P P it have P P P P it P P P it it Q P it it T it p it the P Back Pa it p p p it p P P it it it the have it it see it have P down Disc has J p Q P P it the do it but it it it did p it and this what P it it had it this with it it9 were it it it we On P you it it on it for it have this it this its the the do have it it it it it this if it the p have it this this who it it Dis put have it dis see it disc it DIS have p have this see it it Dis have it Dis it the have dis that
This is with TheBloke/Llama-2-13B-chat-GPTQ which is otherwise coherent enough.
Huh, that's super weird, in my testing it worked flawlessly with Llama 2 7b Chat
Must have to do with the sequence_actual splitting, I will investigate with your prompt
Did you supply bn and un? This is the 13b chat model and it works for anything else than Hi!?
I only supplied the model, and it seems to fail with any prompt.
I only supplied the model, and it seems to fail with any prompt.
That's extremely odd, that's my response (also works with 7b chat for me):
System: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information..
User: Write me a NodeJS script that tells the time in human readable format. Assistant: Sure! Here is a simple Node.js script that will tell you the current time in human-readable format:
const moment = require('moment');
console.log(moment().format('h:mm A'));
Here's what's happening here:
-
require
brings in themoment
library, which is a popular JavaScript library for working with dates and times. -
moment()
creates a new instance of theMoment
class, representing the current date and time. -
format
takes a string argument indicating the desired format for the date and time. In this case, we're asking for the hour and minute in am/pm format. -
console.log
prints the result to the console.
When you run this script, it should output the current time in the format "hh:mm AM" (or PM, depending on the time of day). For example:
14:30 PM
Note that this script uses the moment
library, so you'll need to install it before running this code. You can do this using npm by running the following command:
npm install moment
@turboderp I am quite baffled by your output, it doesn't make any sense to me. I am not able to replicate your issue.
Could you please print:
response
after line 206
new_response
after line 207
full_response
after line 218
I just personally printed them like this: \n\nres|{response}|\n\n etc
Thank you very much
The only other thing I could imagine making truuble is the version of sentence piece and torch
I'm not having a lot of luck with this.
System: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.. User: Hi! Assistant: end do protect it this people! It open up so see p did dis put our they it P M P it P on PA M P back it it it M S P P P it P it P P it have P P P P it P P P it it Q P it it T it p it the P Back Pa it p p p it p P P it it it the have it it see it have P down Disc has J p Q P P it the do it but it it it did p it and this what P it it had it this with it it9 were it it it we On P you it it on it for it have this it this its the the do have it it it it it this if it the p have it this this who it it Dis put have it dis see it disc it DIS have p have this see it it Dis have it Dis it the have dis that
This is with TheBloke/Llama-2-13B-chat-GPTQ which is otherwise coherent enough.
Hi, I tested with 13B model is everything seems OK. Maybe it is related to some hyperparameters. I have changed some of them.
-- Sequence length: 2048
-- Temperature: 0.50
-- Top-K: 250
-- Top-P: 1.00
-- Min-P: 0.00
-- Repetition penalty: 1.00
System: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information..
User: hi
Assistant: Hello! It's nice to meet you! Is there something I can help you with, or would you like to chat for a bit? ���
User: Hi!
Assistant: Hello! ���t's great to see you here! Is there something you'd like to talk about or ask? I'm here to help with any questions you might have. ���
User: Assistant: Hello again! ���t's always nice to see you! Is there anything new you'd like to chat about or ask? I'm all ears and here to help with any questions you might have. ���
User: Hi!
Assistant: Hey there! ���t's so nice to see you again! How have you been? Anything new and exciting happening in your world? ���
User: Hi!
Assistant: Hey there! ���t's great to see you again! How's your day going so far? Anything fun or exciting happening? ���
User: Hi
Assistant: Well hello there! ���t's always a pleasure to see you! How's your day going so far? Anything new and amazing happening in your world? ���
User: hi
Assistant: HIYA! ���t's so great to see you! How's your day treating you so far? Anything fun or exciting on the agenda? ���
The ���s still hint at a problem of some sort. The 13B chat model I tried works fine when just sampled. I'll have to dig into it a little more, maybe with a 7B model?
The ���s still hint at a problem of some sort.
I have to add that I often got those (or single ones) with your normal example chatbot too with the 7B chat and 13B chat model sometimes, idk where they come from
I think tomorrow I will be able to fix at least the sequence slicing problem