
Why is the result the same every time?

cxj01 opened this issue on Mar 17 '23 · 6 comments

Why is the result the same every time?

cxj01 · Mar 17 '23

Low temperature, probably.

tloen · Mar 17 '23

Beyond the system message, the temperature and max tokens are two of many options developers have to influence the output of the chat models. For temperature, higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. In the case of max tokens, if you want to limit a response to a certain length, max tokens can be set to an arbitrary number. This may cause issues, for example if you set the max tokens value to 5, since the output will be cut off and the result will not make sense to users. We generally recommend altering temperature or top_p, but not both. (from guides/chat/introduction)
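To make the max tokens point concrete, here is a minimal sketch using the same legacy openai-python Completion API as the example later in this thread (the prompt is just a placeholder):

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# With max_tokens=5 the completion is cut off after five tokens,
# so the "summary" will almost certainly stop mid-sentence.
truncated = openai.Completion.create(
  model="text-davinci-003",
  prompt="Summarize the plot of Hamlet for a second-grade student:",
  temperature=0.7,
  max_tokens=5
)
print(truncated.choices[0].text)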

temperature  number  Optional  Defaults to 1
What sampling temperature to use, between 0 and 2. 
Higher values like 0.8 will make the output more random, 
while lower values like 0.2 will make it more focused and deterministic.

Check your settings. The temperature and top_p settings control how deterministic the model is in generating a response. If you're asking it for a response where there's only one right answer, then you'd want to set these lower. If you're looking for more diverse responses, then you might want to set them higher. The number one mistake people make with these settings is assuming that they're "cleverness" or "creativity" controls. (from guides/completion/inserting-text)
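As a rough sketch of that point (same legacy Completion API, placeholder prompt): with the temperature near 0, repeated calls come back essentially identical, while a higher value gives noticeably more varied answers.

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# temperature close to 0 -> nearly deterministic, repeated calls match;
# temperature well above 1 -> much more random, repeated calls differ.
for temperature in (0.0, 1.2):
    outputs = [
        openai.Completion.create(
            model="text-davinci-003",
            prompt="Suggest a name for a new coffee shop.",
            temperature=temperature,
            max_tokens=16
        ).choices[0].text.strip()
        for _ in range(3)
    ]
    print(temperature, outputs)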

top_p  number  Optional  Defaults to 1
An alternative to sampling with temperature, called nucleus sampling, 
where the model considers the results of the tokens with top_p probability mass. 
So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
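And a sketch of the alternative knob (an assumed example, not from the repo): leave temperature at its default and tighten top_p instead, so only the highest-probability tokens are ever sampled.

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# Nucleus sampling: temperature stays at the default of 1, and top_p=0.1
# restricts sampling to the tokens making up the top 10% of probability mass.
focused = openai.Completion.create(
  model="text-davinci-003",
  prompt="Write a one-sentence product description for a standing desk.",
  temperature=1,
  top_p=0.1,
  max_tokens=64
)
print(focused.choices[0].text.strip())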

In short, you should modify either top_p or temperature, not both. The OpenAI Playground's default settings look like this:

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# Legacy (pre-1.0) openai-python Completions request using the Playground defaults.
response = openai.Completion.create(
  model="text-davinci-003",
  prompt="Summarize this for a second-grade student:\n\nJupiter is the fifth planet from the Sun and the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two-and-a-half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye in the night sky, and has been known to ancient civilizations since before recorded history. It is named after the Roman god Jupiter.[19] When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows,[20] and is on average the third-brightest natural object in the night sky after the Moon and Venus.",
  temperature=0.7,
  max_tokens=256,
  top_p=1,
  frequency_penalty=0,
  presence_penalty=0
)

T-Atlas · Mar 17 '23

I suggest trying out this config; it's what was used in the original alpaca repo and it works well for me personally.

generation_config = GenerationConfig(
    temperature=0.7,
    top_p=0.9,
    num_beams=1,
    max_new_tokens=600,
    do_sample=True,
)
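For context, here is a minimal sketch of how a config like that could be wired into a Hugging Face transformers generate() call; the model name, prompt, and variable names below are placeholders, not the actual alpaca-lora generate.py code:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

# Placeholder base model; alpaca-lora actually loads LLaMA plus the LoRA adapter weights.
tokenizer = AutoTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = AutoModelForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")

generation_config = GenerationConfig(
    temperature=0.7,
    top_p=0.9,
    num_beams=1,
    max_new_tokens=600,
    do_sample=True,
)

inputs = tokenizer("Tell me about alpacas.", return_tensors="pt")
with torch.no_grad():
    output = model.generate(
        input_ids=inputs["input_ids"],
        generation_config=generation_config,
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))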

devilismyfriend · Mar 17 '23

[screenshot of the generated output showing repeated text]

Why is it generating repetition?

cxj01 · Mar 21 '23

The most likely reason is that the call to generate in generate.py doesn't have do_sample=True, so it is decoding greedily.
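For illustration only (not the exact code in generate.py): in the transformers API, temperature and top_p only take effect when sampling is enabled, so the sketch below assumes a model and input_ids loaded as in the earlier example and simply turns sampling on.

from transformers import GenerationConfig

# Hypothetical sketch: `model` and `input_ids` are assumed to exist already.
generation_config = GenerationConfig(
    temperature=0.7,
    top_p=0.9,
    do_sample=True,  # without this (and with num_beams=1), generate() decodes greedily
)
output = model.generate(
    input_ids=input_ids,
    generation_config=generation_config,
    max_new_tokens=256,
)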

tanitna · Mar 26 '23

most likely reason is the call to generate in generate.py doesn't have do_sample=True

Where do I add "do_sample=True"?

HANiFLY · Apr 16 '23