Philipp Emanuel Weidmann
@Hunterius8 Could you quantify that? What is your tokens/s with and without DRY? On my dev machine, I'm seeing 4.99 tokens/s with DRY and 4.98 tokens/s without it. I'm running...
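For anyone who wants to reproduce that kind of measurement, here is a rough sketch of one way to time it (assumes a Hugging Face model and tokenizer are already loaded; `tokens_per_second` is just an illustrative helper name, and the processor list is whatever you want to benchmark):

```python
import time

def tokens_per_second(model, tokenizer, prompt, processors=None, max_new_tokens=200):
    """Generate and return throughput; pass processors=None for the baseline."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    start = time.perf_counter()
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        logits_processor=processors,  # e.g. a LogitsProcessorList containing DRY
    )
    elapsed = time.perf_counter() - start
    new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
    return new_tokens / elapsed
```

Run it twice on the same prompt and context length, once with the DRY processor in the list and once without, and compare.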
@Hunterius8 I see, that's a lot more context than I've ever run, combined with a pretty high base performance, so this is probably the reason I don't notice it in...
> Make it a LogitsProcessor like other repetition penalties

That means losing control over DRY's position in the sampler stack, right? I think it can be valuable to be able...
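For context, this is roughly what that interface looks like (the class name below is an illustrative skeleton, not the actual implementation). Processors in a `LogitsProcessorList` run in insertion order, so wrapping DRY this way fixes its position in that list instead of leaving it freely reorderable relative to the other samplers:

```python
import torch
from transformers import LogitsProcessor, LogitsProcessorList

class DryLogitsProcessor(LogitsProcessor):  # illustrative skeleton only
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # Penalty computation over input_ids omitted; a real implementation
        # would subtract the DRY penalties from `scores` here.
        return scores

# Order is determined by position in this list, not by a user-configurable stack:
processors = LogitsProcessorList([DryLogitsProcessor()])
```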
@Priestru

> Also is it possible to add smth like a vocabulary of phrases and words that we want to have penalized right off the bat?

I plan to implement...
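Just to illustrate the general shape of such a mechanism (this is hypothetical and not necessarily how it will actually be implemented; the class name, the penalty value, and the batch-size-1 assumption are all mine): a processor can hold a list of pre-tokenized phrases and penalize any token that would complete one of them.

```python
from transformers import LogitsProcessor

class PhrasePenaltyProcessor(LogitsProcessor):  # hypothetical, illustration only
    def __init__(self, phrase_token_ids, penalty=5.0):
        self.phrases = phrase_token_ids  # list of non-empty token-id tuples
        self.penalty = penalty

    def __call__(self, input_ids, scores):
        context = input_ids[0].tolist()  # assumes batch size 1 for brevity
        for phrase in self.phrases:
            *prefix, final = phrase
            # Penalize the final token if the context ends with the phrase prefix
            # (single-token phrases are penalized unconditionally).
            if not prefix or context[-len(prefix):] == prefix:
                scores[0, final] -= self.penalty
        return scores
```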
@l3utterfly is porting DRY to llama.cpp: https://github.com/ggerganov/llama.cpp/pull/6839
@oobabooga Could you give me a hint on how to proceed here? Do you plan to merge this PR? If so, what are the remaining steps?
@ggerganov To me, the repetition penalty is the single most important sampling parameter. Every model I've ever used repeats itself without it. Just recently, I accidentally ran Mixtral-8x7b (currently the...
@oobabooga

> My first impression is that the parameter is very sensitive. Probably 3 decimal places are needed to find something optimal for a given situation.

Optimal, maybe. But beneficial,...
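To make that concrete: the penalty is `multiplier * base ** (n - allowed_length)` for a repeated sequence of length `n`, so the multiplier only scales the curve linearly while the base controls the exponential growth. A quick sketch (parameter values below are just examples):

```python
def dry_penalty(n, multiplier=0.8, base=1.75, allowed_length=2):
    """Penalty for a token extending a repeated sequence of length n."""
    if n < allowed_length:
        return 0.0
    return multiplier * base ** (n - allowed_length)

for m in (0.6, 0.8, 1.0):
    print(m, [round(dry_penalty(n, multiplier=m), 2) for n in range(2, 8)])
```

Because the exponential term dominates for longer repetitions, changing the multiplier shifts where the penalty starts to bite, but it does not take three decimal places of precision to get a benefit.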
@oobabooga I like your idea, and I can see how, in many cases, it would improve the range of values that make sense. The reason I don't think it's a...
@jukofyork

> How do you think this method would work with coding models?

If code in the language that is to be generated is already present in the context, the...
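A relevant mechanical detail here is the sequence breakers: DRY does not extend a match across a breaker token, so heavily punctuated text only accumulates short matches. Below is a simplified, unoptimized reference for the match-length computation (the function name is mine, and the breaker set reflects the defaults; treat both as illustrative):

```python
SEQUENCE_BREAKERS = {"\n", ":", "\"", "*"}  # breaker set assumed from the defaults

def dry_match_length(context, candidate, breakers=SEQUENCE_BREAKERS):
    """Longest suffix of `context` that previously occurred followed by
    `candidate`, without crossing a sequence breaker (O(n^2) reference)."""
    best = 0
    for i in range(len(context) - 1, -1, -1):
        if context[i] != candidate:
            continue
        # Walk backwards, comparing tokens before position i with the
        # context's suffix, stopping at any sequence breaker.
        length = 0
        while (length < i
               and context[i - 1 - length] == context[len(context) - 1 - length]
               and context[i - 1 - length] not in breakers):
            length += 1
        best = max(best, length)
    return best

print(dry_match_length(list("ABCXAB"), "C"))                    # 2: "AB" recurs before "C"
print(dry_match_length(["A", "\n", "C", "X", "A", "\n"], "C"))  # 0: newline breaks the match
```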