FasterTransformer

GPTNeox decoding arguments

w775739733 opened this issue 1 year ago · 3 comments

Hello! I am using GPTNeox for decoding, and I need to pass some parameters to the forward call, such as the length penalty. I also use Transformers and call model.generate() for generation. When I use the same set of parameters for both, the results are not consistent. Is there documentation describing the specific meaning and valid range of these parameters? Are they exactly the same in meaning and range as the parameters in Transformers?

w775739733 avatar Jul 10 '23 14:07 w775739733

The args:

```
"gen_kwargs": {
    "max_new_tokens": 2048,
    "num_beams": 1,
    #"early_stopping": True,
    "do_sample": False,
    "temperature": 0.9,  #0.35,
    "logits_processor": null,
    "top_k": 40,
    "repetition_penalty": 1.01,
    "length_penalty": 1.0,
    "eos_token_id": []
},
```

w775739733 avatar Jul 10 '23 14:07 w775739733
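For reference, here is a minimal sketch of how a config like the one above is typically passed to Hugging Face `model.generate()`. The checkpoint name and prompt are placeholders (the issue does not name a model), and the comments on FasterTransformer counterparts use the argument names from FT's example scripts (`output_len`, `beam_width`, `len_penalty`, `repetition_penalty`); they are an assumed mapping, not a confirmed one-to-one correspondence.

```python
# Sketch only: shows the Hugging Face side of the comparison.
# "EleutherAI/gpt-neox-20b" is a placeholder checkpoint, not taken from the issue.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")

gen_kwargs = {
    "max_new_tokens": 2048,      # FT: output_len (assumed counterpart)
    "num_beams": 1,              # FT: beam_width (1 = no beam search)
    "do_sample": False,          # greedy in HF; temperature/top_k are then ignored
    "temperature": 0.9,          # FT: temperature (only meaningful when sampling)
    "top_k": 40,                 # FT: top_k (only meaningful when sampling)
    "repetition_penalty": 1.01,  # FT: repetition_penalty
    "length_penalty": 1.0,       # FT: len_penalty (beam search only in HF)
}

inputs = tokenizer("Hello, my name is", return_tensors="pt")
output_ids = model.generate(**inputs, **gen_kwargs)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

One likely source of divergence with these exact values: with `do_sample=False` and `num_beams=1`, Hugging Face runs greedy decoding and ignores `temperature` and `top_k`, whereas FT's GPT-NeoX example path typically samples whenever `top_k` (or `top_p`) is non-zero, so the two runs may not even execute the same decoding algorithm.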

"repetition_penalty": 1.01 have bug

RobotGF avatar Jul 22 '23 04:07 RobotGF
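For context on the claim above, this is roughly the CTRL-style multiplicative repetition penalty that Hugging Face implements and that FT's kernels are generally believed to follow (applied to token ids already present in the sequence); a value of 1.01 is a very mild penalty in this formulation. A minimal sketch, not FT's actual CUDA kernel:

```python
# Sketch of the multiplicative (CTRL-style) repetition penalty.
import torch

def apply_repetition_penalty(logits: torch.Tensor,
                             prev_token_ids: torch.Tensor,
                             penalty: float = 1.01) -> torch.Tensor:
    """Penalize tokens that already appeared in the sequence (1-D logits assumed)."""
    penalized = logits.clone()
    seen = torch.unique(prev_token_ids)
    scores = penalized[seen]
    # Positive logits are divided by the penalty, negative ones multiplied,
    # so previously seen tokens always become less likely for penalty > 1.
    penalized[seen] = torch.where(scores > 0, scores / penalty, scores * penalty)
    return penalized
```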

The logic of repetition_penalty in FT is not the same as OpenAI's description. How should it be used? OpenAI (https://platform.openai.com/docs/guides/gpt/managing-tokens) defines it as:

```
mu[j] -> mu[j] - c[j] * alpha_frequency - float(c[j] > 0) * alpha_presence
```

hezeli123 avatar Nov 01 '23 02:11 hezeli123
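For comparison, the OpenAI formula quoted above is additive rather than multiplicative: it subtracts a count-weighted frequency penalty and a flat presence penalty from the logit of every token that has already been generated. A minimal sketch of that formulation, assuming 1-D logits and a list of generated token ids (this is illustrative code, not taken from FT or OpenAI):

```python
# Sketch of OpenAI's additive frequency/presence penalties:
#   mu[j] -> mu[j] - c[j] * alpha_frequency - float(c[j] > 0) * alpha_presence
from collections import Counter
import torch

def apply_openai_penalties(logits: torch.Tensor,
                           generated_ids: list[int],
                           alpha_frequency: float = 0.5,
                           alpha_presence: float = 0.5) -> torch.Tensor:
    penalized = logits.clone()
    counts = Counter(generated_ids)          # c[j]: how often token j was generated
    for token_id, count in counts.items():
        penalized[token_id] -= count * alpha_frequency   # grows with repetitions
        penalized[token_id] -= alpha_presence            # flat penalty once seen
    return penalized
```

The practical consequence is that the two knobs are not numerically comparable: in the multiplicative form used by FT/HF, 1.0 means "no penalty", while in OpenAI's additive form 0.0 means "no penalty", so copying a value from one API to the other will not reproduce the same behavior.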