FasterTransformer

GPTNeox decoding arguments

w775739733 opened this issue 1 year ago · 3 comments

Hello! I am using GPTNeox for decoding, and I need to pass some parameters to the forward call, such as the length penalty. I also use Transformers and call model.generate() for generation. When I use the same set of parameters for both, the results are not consistent. Is there documentation describing the specific meaning and valid range of these parameters? Are they exactly the same in meaning and range as the parameters in Transformers?

w775739733 avatar Jul 10 '23 14:07 w775739733

The args:

```
"gen_kwargs": {
    "max_new_tokens": 2048,
    "num_beams": 1,
    #"early_stopping": True,
    "do_sample": False,
    "temperature": 0.9,  #0.35,
    "logits_processor": null,
    "top_k": 40,
    "repetition_penalty": 1.01,
    "length_penalty": 1.0,
    "eos_token_id": []
},
```

w775739733 avatar Jul 10 '23 14:07 w775739733
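For reference, here is a minimal sketch of how a config like the one above is typically passed to Hugging Face `model.generate()`. The checkpoint name and prompt are placeholders (the issue does not name a model), and the comments on FasterTransformer counterparts use the argument names from FT's example scripts (`output_len`, `beam_width`, `len_penalty`, `repetition_penalty`); they are an assumed mapping, not a confirmed one-to-one correspondence.

```python
# Sketch only: shows the Hugging Face side of the comparison.
# "EleutherAI/gpt-neox-20b" is a placeholder checkpoint, not taken from the issue.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")

gen_kwargs = {
    "max_new_tokens": 2048,      # FT: output_len (assumed counterpart)
    "num_beams": 1,              # FT: beam_width (1 = no beam search)
    "do_sample": False,          # greedy in HF; temperature/top_k are then ignored
    "temperature": 0.9,          # FT: temperature (only meaningful when sampling)
    "top_k": 40,                 # FT: top_k (only meaningful when sampling)
    "repetition_penalty": 1.01,  # FT: repetition_penalty
    "length_penalty": 1.0,       # FT: len_penalty (beam search only in HF)
}

inputs = tokenizer("Hello, my name is", return_tensors="pt")
output_ids = model.generate(**inputs, **gen_kwargs)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

One likely source of divergence with these exact values: with `do_sample=False` and `num_beams=1`, Hugging Face runs greedy decoding and ignores `temperature` and `top_k`, whereas FT's GPT-NeoX example path typically samples whenever `top_k` (or `top_p`) is non-zero, so the two runs may not even execute the same decoding algorithm.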

"repetition_penalty": 1.01 have bug

RobotGF avatar Jul 22 '23 04:07 RobotGF
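For context on the claim above, this is roughly the CTRL-style multiplicative repetition penalty that Hugging Face implements and that FT's kernels are generally believed to follow (applied to token ids already present in the sequence); a value of 1.01 is a very mild penalty in this formulation. A minimal sketch, not FT's actual CUDA kernel:

```python
# Sketch of the multiplicative (CTRL-style) repetition penalty.
import torch

def apply_repetition_penalty(logits: torch.Tensor,
                             prev_token_ids: torch.Tensor,
                             penalty: float = 1.01) -> torch.Tensor:
    """Penalize tokens that already appeared in the sequence (1-D logits assumed)."""
    penalized = logits.clone()
    seen = torch.unique(prev_token_ids)
    scores = penalized[seen]
    # Positive logits are divided by the penalty, negative ones multiplied,
    # so previously seen tokens always become less likely for penalty > 1.
    penalized[seen] = torch.where(scores > 0, scores / penalty, scores * penalty)
    return penalized
```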

The logic of repetition_penalty in FT is not the same as OpenAI's description. How should it be used? OpenAI (https://platform.openai.com/docs/guides/gpt/managing-tokens) defines it as:

```
mu[j] -> mu[j] - c[j] * alpha_frequency - float(c[j] > 0) * alpha_presence
```

hezeli123 avatar Nov 01 '23 02:11 hezeli123
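For comparison, the OpenAI formula quoted above is additive rather than multiplicative: it subtracts a count-weighted frequency penalty and a flat presence penalty from the logit of every token that has already been generated. A minimal sketch of that formulation, assuming 1-D logits and a list of generated token ids (this is illustrative code, not taken from FT or OpenAI):

```python
# Sketch of OpenAI's additive frequency/presence penalties:
#   mu[j] -> mu[j] - c[j] * alpha_frequency - float(c[j] > 0) * alpha_presence
from collections import Counter
import torch

def apply_openai_penalties(logits: torch.Tensor,
                           generated_ids: list[int],
                           alpha_frequency: float = 0.5,
                           alpha_presence: float = 0.5) -> torch.Tensor:
    penalized = logits.clone()
    counts = Counter(generated_ids)          # c[j]: how often token j was generated
    for token_id, count in counts.items():
        penalized[token_id] -= count * alpha_frequency   # grows with repetitions
        penalized[token_id] -= alpha_presence            # flat penalty once seen
    return penalized
```

The practical consequence is that the two knobs are not numerically comparable: in the multiplicative form used by FT/HF, 1.0 means "no penalty", while in OpenAI's additive form 0.0 means "no penalty", so copying a value from one API to the other will not reproduce the same behavior.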