Terry Yue Zhuo

Results: 50 comments by Terry Yue Zhuo

@ArthurZucker When generating with `legacy=False`, I noticed that some samples still couldn't be generated correctly: ![image](https://github.com/huggingface/transformers/assets/36221214/df7504ce-8071-4373-a79f-8a39371fd48a) I suspect there are some other issues as well?

Here are the first few prompts sent to the vLLM calls: [yi_vllm.txt](https://github.com/user-attachments/files/16163872/yi.txt). They look correct, but the outputs are bad. Do you also need the inputs for the `transformers` calls to see why some outputs...

Here are the top 10 pairs of prompts and token IDs! [yi_hf.txt](https://github.com/user-attachments/files/16165333/yi_hf.txt)

Hi @itazap, are there any specific files you want to see? Or just the ones where the model degraded? If that's the case, there were plenty of them in the...

@itazap Unfortunately `legacy=False` hasn't fixed the model degradation. While I noticed that the extra spaces were somehow removed, the new issues shown in https://github.com/huggingface/transformers/issues/31890#issuecomment-2220650597 were unforeseen. A lot of...

@itazap @ArthurZucker Update:

```python
EOS = [
    "",
    "",
    "",
    "\nif __name__",
    "\ndef main(",
    "\nprint(",
]
stop_sequencer = StopSequencer(
    self.model,
    model_type="causal",  # or seq2seq
    tokenizer=self.tokenizer,
)
model = stop_sequencer.register_stop_texts(
    stop_texts=self.eos,
    ...
```
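For reference, the effect of registering stop texts can be approximated in plain Python: cut the generated string at the earliest occurrence of any stop sequence. This is a minimal sketch of the idea only; the function name below is hypothetical and not the actual `StopSequencer` API.

```python
def truncate_at_stop_texts(generated: str, stop_texts: list[str]) -> str:
    """Truncate `generated` at the earliest occurrence of any stop text.

    Sketch of the stop-text idea: scan for each stop sequence and keep
    only the text before the first one found. Empty stop strings are
    skipped, since they would match at position 0.
    """
    cut = len(generated)
    for stop in stop_texts:
        if not stop:  # ignore empty stop strings
            continue
        idx = generated.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return generated[:cut]
```

With stop texts like `"\nprint("`, a completion such as `"def f():\n    return 1\nprint(f())"` would be cut back to just the function body, which is the behavior the EOS list above aims for.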

Oh, this is a further investigation of https://github.com/huggingface/transformers/issues/31890#issuecomment-2220650597. I removed the `stop_sequencer` part and checked the ideal outputs.

@itazap @ArthurZucker Is there going to be a PR to make the correct setup the default? 👀 And since vLLM possibly has a different tokenizer implementation, should we...

Hi @KedarnathKC, thanks for reporting this issue! It'd be great if you could submit a PR :-)

@Slimshilin Thanks for the review. Everything should be fixed now :)