wlhgtc
And by the way: these two methods (`removeprefix` and `removesuffix`) were introduced in Python 3.9. The [code](https://github.com/google-research/google-research/blob/master/instruction_following_eval/instructions.py#L460) will not work on older versions. Maybe you should note this in the README?
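For reference, a minimal sketch of a version-tolerant fallback (the helper name `strip_prefix` is mine, not from the linked code):

```python
import sys

def strip_prefix(s, prefix):
    # str.removeprefix only exists on Python >= 3.9;
    # fall back to slicing on older interpreters.
    if sys.version_info >= (3, 9):
        return s.removeprefix(prefix)
    return s[len(prefix):] if s.startswith(prefix) else s
```

Unlike `s.lstrip(prefix)`, this removes the prefix as a whole string rather than stripping any leading characters from the set.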
This is also what I need. Do you have a development plan for this task?
@liyucheng09
> Understood. I'll write another article in the next couple of days explaining the workflow of using LLMs for compression, and cover the code implementation there. It just occurred to me: this cumulative interval should be exactly the probability interval that Arithmetic Coding needs.
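To make the connection concrete, here is a minimal sketch (names and distribution are illustrative) of how accumulating per-symbol probabilities yields the `[low, high)` intervals that arithmetic coding narrows into:

```python
def cumulative_intervals(probs):
    # Map each symbol to its [low, high) sub-interval of [0, 1),
    # where the widths are the symbol probabilities and the lows
    # are the running cumulative sums.
    intervals, low = {}, 0.0
    for sym, p in probs.items():
        intervals[sym] = (low, low + p)
        low += p
    return intervals
```

With an LLM, `probs` would be the model's next-token distribution at each step, so the intervals change per position.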
> Perhaps related, I compared two full DPO-trained checkpoints, [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) and [alignment-handbook/zephyr-7b-dpo-full](https://huggingface.co/alignment-handbook/zephyr-7b-dpo-full). The MT-bench results seem to differ as well. [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) matches the results in the paper, while [alignment-handbook/zephyr-7b-dpo-full](https://huggingface.co/alignment-handbook/zephyr-7b-dpo-full)...
> Thanks for all your questions and detailed analysis; there are a number of different things to address here.
>
> ### LoRA training:
> The official `zephyr-7b-beta` model used...
If I want to force-stop thinking in an R1-like model (e.g., when `prompt + outputs > 8192`, force it to generate ``), how can I get the `prompt_length` for a single request? Thank...
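As a rough sketch of the budget check being asked about (the helper name and the 8192 limit are assumptions; `prompt_ids` stands for the request's tokenized prompt, however your serving stack exposes it):

```python
def remaining_budget(prompt_ids, max_len=8192):
    # prompt_length is just the token count of the tokenized prompt;
    # the remaining budget is what generation may still consume
    # before a forced stop would be triggered.
    prompt_length = len(prompt_ids)
    return max(0, max_len - prompt_length)
```

When the budget hits zero, the server would need to inject the stop-thinking token and switch to answer generation.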