Shay Duane issues

Results: 8 issues of Shay Duane

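Add newbing dependency installation commands to the Dockerfile

Please add the commands that install the newbing dependencies to the Dockerfile.

When an image is built from Dockerfile+ChatGLM and run, the newbing cookie in config_private.py cannot be loaded, so newbing is unusable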

chatglm raises UnboundLocalError at runtime: local variable 'response' referenced before assignment

- **(1) Describe the bug** An image built with docker from Dockerfile+ChatGLM runs the local chatglm model normally at first, but after a few rounds of Q&A it hangs and the terminal then reports an error.
- **(2) Screen Shot** ![image](https://user-images.githubusercontent.com/78648337/234608538-48ec3fbd-0cf9-4109-8417-dd09b6d81e79.png)
- **(3) Terminal Traceback (if any)** Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:08

bug
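
The traceback above is cut off, so the actual cause in the project is not shown. As a generic illustration only, this is one common way Python produces `UnboundLocalError` for a name like `response`: the variable is assigned only inside a loop, and the loop body never runs (for example, when a streaming generator yields nothing). The function names `last_chunk` and `last_chunk_safe` are hypothetical and are not taken from the project.

```python
def last_chunk(stream):
    """Return the last item from `stream`. Deliberately buggy (hypothetical helper)."""
    for response in stream:
        pass
    # If `stream` yielded nothing, `response` was never assigned,
    # so this line raises UnboundLocalError.
    return response


def last_chunk_safe(stream, default=""):
    """Same as above, but `response` is bound before the loop starts."""
    response = default
    for response in stream:
        pass
    return response


if __name__ == "__main__":
    print(last_chunk_safe(iter([])))          # empty stream falls back to the default
    print(last_chunk_safe(iter(["a", "b"])))  # last item of a non-empty stream
    try:
        last_chunk(iter([]))
    except UnboundLocalError as exc:
        print("UnboundLocalError:", exc)
```

Binding the variable before the loop, or handling the empty case explicitly, avoids the error. Whether this matches the reported failure cannot be confirmed from the truncated log.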

[BUG] Mixtral inference OOM

**Describe the bug** I'm not sure if DeepSpeed needs to be adapted for Mixtral. When I tried using DeepSpeed inference for model inference, it didn't properly implement model parallelism. Instead,...

bug

inference

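Proxy fails after installation completes

After installing according to the documentation, I imported the link into the v2rayN client, but the proxy fails and the latency test times out. I am using a Vultr US server; the corresponding port is open, the Linux firewall is off, the IP is not blocked, and I can ping it locally. Using ws+tls+cdn also fails to proxy. Below is the log output by v2rayN: app/proxyman/outbound: failed to process outbound traffic > proxy/vmess/outbound: failed to find an available destination > common/retry: [transport/internet/websocket: failed to dial WebSocket > transport/internet/websocket: failed to dial to...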

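Trainer progress is no longer displayed after importing the fengshen module

With the original official transformers training script, importing the fengshen module makes the tqdm progress bar disappear while the trainer is training; removing the import restores it. `from fengshen import LongformerForTokenClassification, LongformerConfig` Even if the import is unused, merely importing it still hides the tqdm progress.

Because target mask is not among the model's input parameters, transformers removes it before the data reaches a custom collator, so the custom collator fails for lack of target mask and cannot mask out the instruction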

![WeChat0a41890022b14c5cf2cc36b93c552f58](https://github.com/yangjianxin1/Firefly/assets/78648337/56280e96-74f1-438d-a4bd-17e067e5daad) ![WeChatca64a0b02f2dc31e25ec33cb6ce04b31](https://github.com/yangjianxin1/Firefly/assets/78648337/c7457779-2002-4575-a365-8f61f348d4e9)
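
The behaviour described in the title can be imitated with a small, self-contained sketch. It assumes the filtering is driven by the parameter names of the model's `forward` method, which is how the transformers `Trainer` documents its `remove_unused_columns` option in `TrainingArguments` (default `True`). `ToyModel` and `keep_model_columns` are hypothetical names used only for illustration; this is a simplified imitation, not the library's code.

```python
import inspect


class ToyModel:
    """Stand-in model whose forward() does not accept `target_mask`."""

    def forward(self, input_ids, attention_mask=None, labels=None):
        return None


def keep_model_columns(example, model):
    """Keep only the keys that match parameter names of model.forward."""
    accepted = set(inspect.signature(model.forward).parameters)
    return {key: value for key, value in example.items() if key in accepted}


example = {
    "input_ids": [1, 2, 3],
    "attention_mask": [1, 1, 1],
    "target_mask": [0, 1, 1],  # needed by the custom collator, unknown to forward()
}

filtered = keep_model_columns(example, ToyModel())
print(sorted(filtered))  # `target_mask` is gone before a collator would see it
```

With the real `Trainer`, passing `remove_unused_columns=False` to `TrainingArguments` keeps such extra columns, so a custom collator can still read `target_mask`. Whether that is the appropriate fix for this repository is an assumption.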

SLURM cannot achieve cross-node parallelism

I have a SLURM cluster with 50 nodes, each node having 96 CPU cores. I want to execute a job on the cluster, and the job is divided into 192...