alpaca-lora
Why is the result the same every time?
Low temperature, probably.
Beyond the system message, temperature and max tokens are two of the many options developers have to influence the output of the chat models. For temperature, higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. For max tokens, if you want to limit a response to a certain length, max tokens can be set to an arbitrary number. This may cause issues, for example if you set max tokens to 5, since the output will be cut off and the result will not make sense to users. For temperature, we generally recommend altering it or top_p, but not both. (From guides/chat/introduction.)
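To make that concrete, here's a minimal sketch of both knobs in a chat call, using the same pre-1.0 openai Python client as the Completion example further down; the model name and prompt are just placeholders:

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# Higher temperature (e.g. 0.8) -> more random output; lower (e.g. 0.2) -> more focused and deterministic.
# A tiny max_tokens value such as 5 will cut the reply off mid-sentence.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Explain why the sky is blue."}],
    temperature=0.8,
    max_tokens=5,
)
print(response["choices"][0]["message"]["content"])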
temperature (number, optional, defaults to 1)
What sampling temperature to use, between 0 and 2.
Higher values like 0.8 will make the output more random,
while lower values like 0.2 will make it more focused and deterministic.
Check your settings. The temperature and top_p settings control how deterministic the model is when generating a response. If you're asking for a response where there's only one right answer, you'd want to set them lower. If you're looking for more diverse responses, you might want to set them higher. The number one mistake people make with these settings is assuming they're "cleverness" or "creativity" controls. (From guides/completion/inserting-text.)
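As a quick illustration of that advice, here's a small sketch using the same pre-1.0 openai client and Completion endpoint as the example below (the prompts are only placeholders):

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# One-right-answer question: keep temperature low for a focused, repeatable reply.
factual = openai.Completion.create(
    model="text-davinci-003",
    prompt="In what year did Apollo 11 land on the Moon?",
    temperature=0.2,
    max_tokens=20,
)

# Open-ended prompt: raise temperature for more varied replies across calls.
creative = openai.Completion.create(
    model="text-davinci-003",
    prompt="Suggest five names for a hiking club.",
    temperature=0.9,
    max_tokens=60,
)

print(factual["choices"][0]["text"])
print(creative["choices"][0]["text"])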
top_p (number, optional, defaults to 1)
An alternative to sampling with temperature, called nucleus sampling,
where the model considers the results of the tokens with top_p probability mass.
So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
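To make "the top 10% probability mass" concrete, here's an illustrative nucleus-sampling filter written from the description above; this is just a sketch in PyTorch, not the API's actual implementation:

import torch

def nucleus_sample(logits: torch.Tensor, top_p: float = 0.1) -> int:
    # Turn the raw logits for one step into a probability distribution.
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    # Keep the smallest set of tokens whose cumulative probability mass reaches top_p.
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    keep = (cumulative - sorted_probs) < top_p  # the most likely token is always kept
    filtered = torch.where(keep, sorted_probs, torch.zeros_like(sorted_probs))
    filtered = filtered / filtered.sum()
    # Sample one token id from the renormalized nucleus.
    return sorted_idx[torch.multinomial(filtered, 1)].item()

With top_p=1 the nucleus is the entire vocabulary, so only temperature shapes the sampling.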
So the recommendation is to modify either top_p or temperature, but not both.
The OpenAI Playground's default settings look like this:
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# Playground defaults: temperature 0.7, top_p 1, no repetition penalties.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Summarize this for a second-grade student:\n\nJupiter is the fifth planet from the Sun and the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two-and-a-half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye in the night sky, and has been known to ancient civilizations since before recorded history. It is named after the Roman god Jupiter.[19] When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows,[20] and is on average the third-brightest natural object in the night sky after the Moon and Venus.",
    temperature=0.7,
    max_tokens=256,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
)
I suggest trying out this config; it's what was used in the original alpaca repo, and it works well for me personally.
from transformers import GenerationConfig

generation_config = GenerationConfig(
    temperature=0.7,
    top_p=0.9,
    num_beams=1,
    max_new_tokens=600,
    do_sample=True,
)
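For completeness, a minimal sketch of passing that config to generate, assuming model and tokenizer are an already-loaded transformers causal LM and its tokenizer, and prompt is the input string (none of these are shown in this thread):

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(output[0], skip_special_tokens=True))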
Why is the model generating repetitive output?
The most likely reason is that the call to generate in generate.py doesn't have do_sample=True, so it is being greedy.
Where should I add do_sample=True?
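Not a definitive patch, but a sketch of where the flag would go, assuming generate.py builds a GenerationConfig for its model.generate call in the same style as the config above (the variable names here are placeholders, not the exact code in the repo):

# Illustrative excerpt; temperature, top_p, num_beams, max_new_tokens, model and
# input_ids are assumed to already exist in the surrounding function.
generation_config = GenerationConfig(
    temperature=temperature,
    top_p=top_p,
    num_beams=num_beams,
    do_sample=True,  # enable sampling so temperature/top_p take effect instead of greedy decoding
)
output = model.generate(
    input_ids=input_ids,
    generation_config=generation_config,
    max_new_tokens=max_new_tokens,
)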