xiaofei comments

Results: 14 comments of xiaofei

A trouble when I run detect.py

The error is probably in the script `parse_darknet_yolo2.py`; its parsing rules may not apply directly to `yolo-voc.weights`. I tested the `model.ckpt` converted from `yolo-voc.weights` in my own...

A trouble when I run detect.py

The error was probably caused by the line `major, minor, revision, seen = struct.unpack('4i', f.read(16))`. The log is `major=0, minor=1, revision=0, seen=7622400` when I parse `yolo.weights`. But the log is...
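For illustration, here is a minimal sketch of a header reader that does not hard-code four 32-bit integers. It assumes the layout used by recent Darknet versions, where `seen` is stored as a 64-bit integer once the file version reaches 0.2 and as a 32-bit integer before that. The function name `read_darknet_header` is hypothetical and is not part of `parse_darknet_yolo2.py`.

```python
import io
import struct


def read_darknet_header(f):
    """Read (major, minor, revision, seen) from a binary file object.

    Assumption: files with version >= 0.2 store `seen` as an unsigned
    64-bit integer; older files store it as a signed 32-bit integer.
    """
    major, minor, revision = struct.unpack('<3i', f.read(12))
    if major * 10 + minor >= 2 and major < 1000 and minor < 1000:
        (seen,) = struct.unpack('<Q', f.read(8))  # 8-byte counter
    else:
        (seen,) = struct.unpack('<i', f.read(4))  # 4-byte counter
    return major, minor, revision, seen


# Old-style header, using the values logged for `yolo.weights` above.
old_header = io.BytesIO(struct.pack('<4i', 0, 1, 0, 7622400))
print(read_darknet_header(old_header))
```

If a file with the wider `seen` field is read with `'4i'`, the reader stops four bytes too early, so every weight read afterwards is shifted. Whether that explains the failure here would need to be checked against the actual file.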

Different decode results when decode batch_size=1 and >1

@srvinay @xinq2016 I encountered the same problem. During training, the parameter `mb_size` (mini-batch size) defaults to 16, but at test time the prediction results will be different if `mb_size`...

[moss-moon-003-sft-plugin-int4] Error when running the plugin model code from the examples

moss-moon-003-sft-plugin-int8 has the same problem. Do the quantized models have any special dependencies?

[moss-moon-003-sft-plugin-int4] Error when running the plugin model code from the examples

This is my environment; the system is Ubuntu 18.04.
```
pip install -r requirements.txt triton
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: triton in /home/admin/miniconda3/envs/moss/lib/python3.8/site-packages (2.0.0)
Requirement already satisfied: torch==1.10.1 in /home/admin/miniconda3/envs/moss/lib/python3.8/site-packages (from -r requirements.txt (line 1)) (1.10.1)...
```

[moss-moon-003-sft-plugin-int4] Error when running the plugin model code from the examples

Same error: https://github.com/OpenLMLab/MOSS/issues/107

[moss-moon-003-sft-plugin-int4] Error when running the plugin model code from the examples

It is provided in the project: https://github.com/OpenLMLab/MOSS/blob/main/utils.py @sun1092469590

SFT loss

Regarding the loss on the prompt in the original question: it was excluded in the pre-training stage of models such as GPT2 or GPT3. Whether to do the same in the SFT stage,...
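To make "excluding the prompt loss" concrete, here is a minimal pure-Python sketch, not the code of any particular training framework. It assumes the labels are already aligned with the per-position log-probabilities, and it uses a sentinel value of -100 for masked positions, mirroring the default `ignore_index` of PyTorch's cross-entropy loss. The helper names `mask_prompt_labels` and `masked_nll` are hypothetical.

```python
import math

IGNORE_INDEX = -100  # sentinel for positions excluded from the loss (assumed convention)


def mask_prompt_labels(token_ids, prompt_len):
    """Replace the first `prompt_len` labels with IGNORE_INDEX."""
    return [IGNORE_INDEX] * prompt_len + list(token_ids[prompt_len:])


def masked_nll(log_probs, labels):
    """Average negative log-likelihood over unmasked positions only.

    log_probs[t][v] is the log-probability of token v at position t.
    """
    kept = [-log_probs[t][y] for t, y in enumerate(labels) if y != IGNORE_INDEX]
    return sum(kept) / len(kept) if kept else 0.0


# Toy example: 2 prompt tokens, 2 response tokens, uniform model over 10 tokens.
tokens = [5, 6, 7, 8]
labels = mask_prompt_labels(tokens, prompt_len=2)
uniform = [[math.log(0.1)] * 10 for _ in tokens]
print(labels, masked_nll(uniform, labels))
```

With the mask in place, only the response tokens contribute to the average; passing the unmasked `tokens` as labels would include the prompt positions as well.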

The loss in reward_model.py

> In our case, it is also a scalar. The vector is from batch dimension instead of seq-length dimension.

@yaozhewei Your explanation here seems to be wrong. There is already a...

The loss in reward_model.py

> > In our case, it is also a scalar. The vector is from batch dimension instead of seq-length dimension.
>
> Thanks for the reply. I still have confusion....