Ant0082 issues

Results: 8 issues of Ant0082

This is a piece of junk code, don't look at it anymore!

What is the difference between CPM-Ant+ and CPM-Ant?

question

[BUG] cf challenge fail & Failed to refresh session

**Describe the bug**
Running the command "xvfb-run python -m revChatGPT --debug --text" produces the error: cf challenge fail & Failed to refresh session

**To Reproduce**
Run the command: xvfb-run python -m revChatGPT...

bug

[REQUEST] mpu module

Is there an API for the MPU? Where can I find how to use these methods? Do I need to experiment one by one manually?

```python
mpu.copy_to_model_parallel_region
mpu.gather_from_model_parallel_region
```

enhancement

Why does the model occupy less GPU memory after quantization, but the inference speed is slower?

Using the vector-wise symmetric quantization method.
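
For readers unfamiliar with the method named in this issue, here is a minimal sketch of vector-wise symmetric int8 quantization, assuming one scale per weight row. The function names are hypothetical and are not taken from the project in question. The sketch also shows the extra rescaling step that a quantized matrix-vector product needs. That step may be part of the slowdown the issue describes, but this is an assumption and is not confirmed for the model being asked about.

```python
def quantize_rows(weights):
    """Symmetric int8 quantization with one scale per row (vector-wise)."""
    quantized, scales = [], []
    for row in weights:
        max_abs = max(abs(w) for w in row)
        # Symmetric: the range [-max_abs, max_abs] maps onto [-127, 127].
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        quantized.append([max(-127, min(127, round(w / scale))) for w in row])
        scales.append(scale)
    return quantized, scales


def dequantize_rows(quantized, scales):
    """Recover approximate float weights from int8 values and row scales."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]


def matvec_quantized(quantized, scales, x):
    """Matrix-vector product on quantized weights.

    Each row result is rescaled by that row's scale, which is work
    the full-precision product does not have to do.
    """
    return [s * sum(q * xi for q, xi in zip(row, x))
            for row, s in zip(quantized, scales)]


weights = [[0.5, -1.0, 0.25], [2.0, 0.0, -4.0]]
q_weights, row_scales = quantize_rows(weights)
restored = dequantize_rows(q_weights, row_scales)
x = [1.0, 1.0, 1.0]
approx = matvec_quantized(q_weights, row_scales, x)
exact = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
```

The int8 values take a quarter of the storage of 32-bit floats, which matches the lower memory use reported in the issue. The rounding error per weight is at most half of that row's scale.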

Hello, I have a few questions.

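Hello:

> 1. Has the training code for PrefixLM and LM been open-sourced yet? Is it the pretrain_yuan_13B.sh script?
> 2. Which architecture was the dialog model in your API trained on?
> 3. We don't have that many GPUs, only 4 machines with 8 V100 cards each (32 GB of memory per card), and we want to reproduce your dialog model. With 32 cards, is the Global BS the only thing affected? I don't know whether lowering this parameter would significantly affect the final model quality. Have you run any similar experiments?
> 4. openBMB, open-sourced by BAAI, can run some large models with only a few cards. If we follow the approach in your paper but switch to the openBMB framework, would that significantly affect the results?
> 5. Reading the issues in your open-source repository, I saw that you recommend using deepspeed's zero-offload on a single card. Could this approach be used to reproduce your work (the dialog model) with a few dozen cards, without much loss in quality?
> 6. I also saw in the issues that you can open-source part of the data processing code, but that an application is required. Where can I apply? Thank you.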

The XLore2 service is timing out. How do I apply for API access?

The code for the FAQ toolkit appears to be missing.

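Is there reference code for this part of the implementation? Based on the description in the Readme, it seems to use Bert to measure the similarity between the question and the text stored during the "Knowledge Explore" stage, in order to build a simple FAQ and construct QA pairs that are added to the Prompt. Is that how your implementation works too? Another question: colloquial text such as "What's for lunch?" or "What's your name?" carries little knowledge, and replies to such sentences tend to be poor. Have you done any special handling for them?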
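
The retrieval scheme this issue guesses at can be sketched roughly as follows. This is only an illustration of the idea as the issue describes it, not the repository's actual implementation. A bag-of-words cosine similarity stands in for Bert embeddings so that the example stays self-contained. All function names, the sample FAQ entries, and the `min_score` threshold are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    # Hypothetical stand-in for a Bert sentence embedding: lowercase word counts.
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, faq, top_k=1, min_score=0.5):
    """Return up to top_k stored QA pairs whose question is similar enough."""
    q_vec = embed(question)
    scored = [(cosine(q_vec, embed(q)), q, a) for q, a in faq]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(q, a) for score, q, a in scored[:top_k] if score >= min_score]


def build_prompt(question, faq):
    """Prepend retrieved QA pairs to the user question."""
    lines = [f"Q: {q}\nA: {a}" for q, a in retrieve(question, faq)]
    lines.append(f"Q: {question}\nA:")
    return "\n".join(lines)


faq = [
    ("What is the capital of France?", "Paris."),
    ("Who wrote Hamlet?", "William Shakespeare."),
]
knowledge_prompt = build_prompt("what is the capital city of France", faq)
chitchat_prompt = build_prompt("What's for lunch?", faq)
```

In this sketch, a colloquial question that matches nothing above the threshold gets no QA pairs in its prompt. That is one possible way to treat such inputs, shown here as an assumption and not as the handling the issue asks about.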