Mingming-Yin issues

Results: 10 issues by Mingming-Yin

How to test multi-reference text?

For example, the DUC2004 dataset has 4 references for each article. How should the ROUGE score be computed in this case?

enhancement

help wanted
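For the multi-reference ROUGE question above, one way to make the setup concrete is to score the candidate against each reference separately and then aggregate. The sketch below is a simplified ROUGE-N F1 on whitespace tokens, not the official toolkit; the function names `rouge_n` and `multi_ref_rouge` are hypothetical, and taking the maximum over references is an assumption (averaging is another common choice, and the official scripts may aggregate differently).

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 between one candidate and one reference."""
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def multi_ref_rouge(candidate, references, n=1, aggregate=max):
    """Score against every reference, then aggregate (assumption: max by default)."""
    scores = [rouge_n(candidate, ref, n) for ref in references]
    return aggregate(scores)


if __name__ == "__main__":
    refs = ["the cat sat on the mat", "a cat was sitting on the mat"]
    print(multi_ref_rouge("the cat sat on a mat", refs))
```

Passing a different `aggregate` callable (for example a mean) switches the convention without touching the per-reference scoring.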

Why is leaf_value negated in the update_recursive function?

https://github.com/junxiaosong/AlphaZero_Gomoku/blob/68603c0d8e5a0ef9273bacc7d281abe27493da1b/mcts_alphaZero.py#L137 I don't fully understand why leaf_value is negated here, which differs from the AlphaGo Zero paper. Could you explain this? Thank you very much.
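One possible reading of the sign flip, assuming a two-player zero-sum game in which each node stores its value from the perspective of the player who moved into it: a value that is good for one player is equally bad for the opponent, so the sign alternates at every level on the way back to the root. The sketch below illustrates that idea only; the `Node` class and the `backup` method are hypothetical names and are not taken from the linked repository.

```python
class Node:
    """Minimal tree node for illustrating value backup (hypothetical)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.visits = 0
        self.q = 0.0  # running mean of backed-up values

    def update(self, value: float) -> None:
        # Incremental mean of all values seen at this node.
        self.visits += 1
        self.q += (value - self.q) / self.visits

    def backup(self, value: float) -> None:
        # The parent belongs to the other player, so it sees the opposite sign.
        if self.parent is not None:
            self.parent.backup(-value)
        self.update(value)


root = Node()
child = Node(parent=root)
leaf = Node(parent=child)

# Assumption: the evaluator scores the leaf for the player about to move there,
# which is the opponent of the player who moved into the leaf, hence the negation.
leaf_value = 0.6
leaf.backup(-leaf_value)
```

Under these assumptions the leaf and the root end up with the negated value while the node between them keeps the original sign.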

What is the performance on the WMT'14 EN-DE dataset?

Hi, could you reproduce the results of the "Attention is All You Need" paper on the WMT'14 datasets? I want to know the exact BLEU scores of your system on WMT'14...

Could you find time to write a README?

I can't figure out how to use this. Could you please take some time to write a README?

Does this support multi-GPU training?

This is a great repo. Can this code support multi-GPU training? I wonder if it can achieve the same performance as tensor2tensor on wmt14-en-de corpus. Thanks.

Has anyone tried different learning rates to assess their impact on experimental results?

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction During the process of full parameter fine-tuning, has there been an attempt to examine...

pending

How can I confirm flash-attn information during the training process?

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction #1882 When using the qwen model, flash-attn is automatically enabled. How can I confirm...

pending

The fine-tuned Gemma model encounters an error when loaded through vllm: `KeyError: 'lm_head.weight'`

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction deepspeed --include="localhost:0,1,2,3,4,5,6,7" src/train_bash.py \ --stage sft \ --do_train \ --model_name_or_path gemma-7b \ --dataset XXX.json...

pending

RuntimeError: Already borrowed

When the m3 model is used as shown below, an error occurs if the service is called in parallel.

```
from FlagEmbedding import BGEM3FlagModel
model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=True)
...
```
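This error is often reported when a tokenizer or model object is shared by several threads at once. Assuming that is the cause here, one workaround is to serialize access with a lock. The sketch below shows the pattern only: `LockedEncoder` and `fake_encode` are hypothetical names, and `fake_encode` stands in for the real model call so that the example runs without any model installed.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedEncoder:
    """Wrap a non-thread-safe encode function so only one call runs at a time."""

    def __init__(self, encode_fn):
        self._encode_fn = encode_fn
        self._lock = threading.Lock()

    def encode(self, texts):
        with self._lock:  # concurrent callers wait here instead of colliding
            return self._encode_fn(texts)


def fake_encode(texts):
    # Hypothetical stand-in for the real model's encode call.
    return [len(t) for t in texts]


encoder = LockedEncoder(fake_encode)
batches = [["a"], ["bb"], ["ccc"]]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(encoder.encode, batches))
```

The lock trades throughput for safety; batching requests into a single encode call or running one model instance per worker process are alternatives if parallel speed matters.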

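Loading question-answer pair data for question answering

A question: our database stores question-answer pairs. When building documents, I only want to embed the question. After a question is retrieved via its embedding, I want to fetch the corresponding answer and pass it on for further processing. How should this kind of data be loaded?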
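The setup described above can be sketched as follows: only the question text is embedded and indexed, while the answer travels along as attached data and is returned after retrieval. This is a minimal illustration, not the API of any particular framework; `embed` is a toy bag-of-words stand-in for a real embedding model, and `qa_pairs`, `index`, and `answer_for` are hypothetical names.

```python
import math
from collections import Counter

qa_pairs = [
    {"question": "How do I reset my password?",
     "answer": "Use the reset link on the login page."},
    {"question": "What is the refund policy?",
     "answer": "Refunds are available within 30 days."},
]


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a real system would call an embedding model here.
    return Counter(text.lower().replace("?", "").split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Index only the question; keep the whole pair alongside its vector.
index = [(embed(pair["question"]), pair) for pair in qa_pairs]


def answer_for(query: str) -> str:
    query_vec = embed(query)
    _, best_pair = max(index, key=lambda item: cosine(query_vec, item[0]))
    return best_pair["answer"]
```

In a vector-store setting the same idea usually maps to putting the question in the document content and the answer in the document's metadata.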