Yuxian Gu

Results 80 comments of Yuxian Gu

The transformer module has to be updated manually because we have made several modifications to it. We will consider updating it to the latest version soon.

We have not tested on Windows. However, it should run if you use Docker.

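Thank you for your interest! You can first check whether torch works properly inside the Docker container, whether the GPU is usable, and whether the container has an SSH service.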
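The three checks above can be scripted. The following is a minimal sketch, not part of EVA: the helper name `check_container` is hypothetical, and looking for an `sshd` binary on `PATH` is only an assumed proxy for "the container has an SSH service" (it does not prove the service is running).

```python
import importlib.util
import shutil


def check_container():
    """Report whether torch imports, whether CUDA is usable, and whether sshd is installed."""
    report = {
        "torch_importable": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
        # Assumption: an sshd binary on PATH is used as a rough sign of an SSH service.
        "sshd_on_path": shutil.which("sshd") is not None,
    }
    if report["torch_importable"]:
        try:
            import torch

            report["cuda_available"] = bool(torch.cuda.is_available())
        except Exception:
            # A broken install can be discoverable but still fail to import.
            report["torch_importable"] = False
    return report


report = check_container()
for name, ok in report.items():
    print(f"{name}: {ok}")
```

Run it with the same Python interpreter inside the container that you use to launch the model, since a different interpreter may see a different set of packages.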

When it runs correctly, the final output looks like the following, ending with an input prompt: ![image](https://user-images.githubusercontent.com/38183190/133404907-14dd9dce-dbee-4354-9a01-cc0ed203a7ca.png)

Also, please file future issues in the original EVA repository, https://github.com/thu-coai/EVA. Issues there get faster replies, and that repository receives updates first. Thanks!

Getting negative losses seems abnormal. `pg_loss` and the reward have opposite signs (see [this function](https://github.com/microsoft/LMOps/blob/aa2b4680c60108d46101b3fb1eeff90f0f94e2d6/minillm/minillm/losses.py#L58)), and the reward equals log p, which is negative. Therefore, `pg_loss` should be...

`inf_mask` ensures that when the output distribution (`probs`) contains zeros (for example, when using top-p sampling), `logprobs` is assigned `-float("inf")` instead of `nan`. You can try commenting out [this line](https://github.com/microsoft/LMOps/blob/e12e572665c8deb6b4ac3d452ea8a68fb0af0ffa/minillm/minillm/utils.py#L57) to...
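To make the masking idea concrete, here is a small pure-Python sketch. It is not the code in `utils.py`: the helper `apply_inf_mask` and the sample values are hypothetical, and plain lists stand in for tensors. It only illustrates overwriting the masked positions with `-inf` so that a `nan` at a zero-probability token does not propagate.

```python
import math

NEG_INF = float("-inf")


def apply_inf_mask(logprobs, inf_mask):
    """Return a copy of logprobs where every masked position is set to -inf."""
    return [NEG_INF if masked else lp for lp, masked in zip(logprobs, inf_mask)]


# Hypothetical example: the third token was filtered out (probability zero),
# and its log-probability came out as nan somewhere upstream.
raw_logprobs = [math.log(0.7), math.log(0.3), float("nan")]
inf_mask = [False, False, True]

fixed = apply_inf_mask(raw_logprobs, inf_mask)
print(fixed)
```

Unmasked entries pass through unchanged, so the mask only affects tokens that the sampling filter removed.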

We did not filter the corpus. We construct the data as follows: 1. Combine these sources. 2. Shuffle the documents. 3. Tokenize them into chunks of 512 tokens. 4....
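Steps 1 to 3 can be sketched as below; the remaining steps are cut off in the comment and are not covered. This is an illustration under assumptions, not the authors' pipeline: `build_chunks` is a hypothetical name, whitespace splitting stands in for the real tokenizer, and concatenating tokens across document boundaries before chunking is an assumption.

```python
import random


def build_chunks(sources, chunk_size=512, seed=0, tokenize=str.split):
    """Combine sources, shuffle documents, and cut the token stream into fixed-size chunks."""
    # Step 1: combine all sources into one list of documents.
    documents = [doc for source in sources for doc in source]
    # Step 2: shuffle the documents (seeded so the result is reproducible).
    random.Random(seed).shuffle(documents)
    # Step 3: tokenize and split into chunks of chunk_size tokens.
    # Assumption: tokens are concatenated across documents before chunking.
    tokens = [tok for doc in documents for tok in tokenize(doc)]
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


sources = [["a b c", "d e"], ["f g h i"]]
chunks = build_chunks(sources, chunk_size=4)
print([len(chunk) for chunk in chunks])
```

With this layout the last chunk may be shorter than `chunk_size`; whether to pad or drop it is a separate choice that the comment does not specify.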

Is the environment `distil` activated with `conda activate distil`? Can you import `deepspeed` in the interactive environment after simply running `python3`?
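Both questions can be answered from a single script run with the same `python3`. The sketch below is an assumption-laden helper, not part of the repository: `describe_environment` is a hypothetical name, and it reads `CONDA_DEFAULT_ENV`, which conda sets when an environment is activated.

```python
import importlib.util
import os
import sys


def describe_environment(module_name="deepspeed"):
    """Report the interpreter path, the active conda environment, and module visibility."""
    return {
        "python": sys.executable,
        # None means no conda environment is active in this shell.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If `conda_env` is not `distil`, or `python` points outside that environment, the package was probably installed for a different interpreter than the one being run.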