Guangyao Li
Hi @lvermue, I think the equation still has a problem. Each position in `e2_multi` whose label is 1 becomes `1*(1-Config.label_smoothing_epsilon)`, and position in `e2_multi` whose label...
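After loading the legacy model with `LTP(path = "legacy")`, word segmentation plus tokenization of a 30-million-line corpus takes only a little over an hour in a single process. My lines are very short, though, averaging only about 25 in length.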
@mrwyattii I found two small issues that need improvement for qwen-1.5 on DeepSpeed-MII. 1. [There is no bos token in qwen-1.5](https://github.com/QwenLM/Qwen1.5/issues/144), so [this line of code](https://github.com/microsoft/DeepSpeed-MII/blob/main/mii/batching/ragged_batching.py#L220) (i.e., `output_tokens = torch.cat((r.prompt_tokens[1:], output_tokens))`)...
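To illustrate the first point: a minimal sketch of a guard that drops the leading prompt token only when it really is a BOS token. It uses plain Python lists instead of tensors, and `strip_leading_bos`, `join_tokens`, and the `bos_token_id` argument are hypothetical names for illustration, not part of DeepSpeed-MII.

```python
from typing import List, Optional


def strip_leading_bos(prompt_tokens: List[int], bos_token_id: Optional[int]) -> List[int]:
    """Return the prompt without its first token only if that token is BOS.

    bos_token_id is None for tokenizers that define no BOS token
    (assumed here to be the case for qwen-1.5, per the linked issue).
    """
    if bos_token_id is not None and prompt_tokens and prompt_tokens[0] == bos_token_id:
        return prompt_tokens[1:]
    # No BOS to strip: keep every prompt token.
    return list(prompt_tokens)


def join_tokens(prompt_tokens: List[int], output_tokens: List[int],
                bos_token_id: Optional[int]) -> List[int]:
    """Concatenate prompt and generated tokens, skipping a leading BOS if present."""
    return strip_leading_bos(prompt_tokens, bos_token_id) + list(output_tokens)
```

With an unconditional `prompt_tokens[1:]`, a prompt that has no BOS would lose its first real token; the check above avoids that.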
@Robot-2020 Hi~ I recently ran into the same issue as you. I think the problem is caused by the gcc version. The original gcc version on my lab server is...
I ran into a similar case. Here is my code:

```
def worker(rank, this_model):
    try:
        if this_model is None:
            client = mii.client('qwen')
        else:
            client = this_model
        response = client.generate(["xxx"], max_new_tokens=1024, stop="",...
```
> I ran into a similar case. Here is my code:
>
> ```
> def worker(rank, this_model):
>     try:
>         if this_model is None:
>             client = mii.client('qwen')
>         else:...
> ```
> > I ran into a similar case. Here is my code:
> >
> > ```
> > def worker(rank, this_model):
> >     try:
> >         if this_model is None:
> >...
> > ```
@mrwyattii Hi~ Does DeepSpeed-MII support multiple replicas on a single node? For example, I have a node with 8 A100 GPUs, and I set tensor_parallel to 4 and replica_num is...
May I ask whether you managed to get the files from the link?
@3waterya I just got an answer in [this issue](https://github.com/syxu828/Crosslingula-KG-Matching/issues/9). I'll test whether it works first.