Bo仔很忙 issues

Results: 15 issues of Bo仔很忙

Removing the padding part, and the final acc computation

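I see that Su's (苏神) code removes the padding part. As for the final acc computation, I see it divides by y_pred.sum(), so what you are actually computing is precision, right? Since everything is measured at the entity level, a separate acc does not seem necessary.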
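To make the point about the denominator concrete, here is a minimal sketch of entity-level metrics. It is not the code from the repository under discussion; the function name `entity_prf` and the `(start, end, label)` tuple layout are assumptions made for illustration. Dividing the number of correct entities by the number of predicted entities (the role played by `y_pred.sum()`) gives precision, while dividing by the number of gold entities gives recall.

```python
def entity_prf(y_true, y_pred):
    """Entity-level precision, recall and F1.

    y_true, y_pred: iterables of hashable entities, here (start, end, label)
    tuples. This layout is an assumption for the example.
    """
    true_set, pred_set = set(y_true), set(y_pred)
    correct = len(true_set & pred_set)
    # Dividing by the number of predictions is precision, not accuracy.
    precision = correct / len(pred_set) if pred_set else 0.0
    # Dividing by the number of gold entities is recall.
    recall = correct / len(true_set) if true_set else 0.0
    total = len(pred_set) + len(true_set)
    f1 = 2 * correct / total if total else 0.0
    return precision, recall, f1


gold = [(0, 2, "PER"), (5, 7, "LOC"), (9, 10, "ORG")]
pred = [(0, 2, "PER"), (5, 7, "ORG")]
p, r, f1 = entity_prf(gold, pred)
print(p, r, f1)
```

Only one of the two predictions matches a gold entity exactly, so precision and recall differ here, which is why reporting a single "acc" can be misleading.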

Calling bin_data_split on discrete variables

Why is bin_data_split used for discrete variables? Inside bin_data_split, what happens if np.unique(var)8? Is the variable then simply binned by percentile?

Understanding the supervised part

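A question: after reading the original SimCSE authors' code, the supervised part looks essentially the same as Sentence-BERT's MultipleNegativesRankingLoss. Is that the right way to understand it?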
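For readers comparing the two, the shared idea can be sketched as a cross-entropy over in-batch candidates. This is a plain-Python illustration, not the SimCSE or sentence-transformers implementation; the function names, the optional `hard_negatives` argument, and the default `scale=20.0` (equivalent to a temperature of 0.05) are assumptions chosen for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_negatives_loss(anchors, positives, hard_negatives=(), scale=20.0):
    """Mean cross-entropy where anchor i must pick positives[i].

    Every other positive in the batch (and any hard negative) acts as a
    negative candidate for anchor i.
    """
    candidates = list(positives) + list(hard_negatives)
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.1], [0.1, 1.0]]
aligned = in_batch_negatives_loss(anchors, positives)
swapped = in_batch_negatives_loss(anchors, positives[::-1])
print(aligned, swapped)
```

The loss is small when each anchor is closest to its own positive and large when the pairs are swapped. Any difference between the two libraries would lie in details such as how hard negatives are appended, not in this core structure.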

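Quite a few issues are about errors when loading pretrained models, including warnings and mismatched config parameters. The explanation is as follows:
- Cause: the warnings appear because the keys in the model file are not fully aligned with bert4torch's keys; the config parameters differ because the author modified the original config files (to unify parameter names).
- Solution: see the [end of the README](https://github.com/Tongjilibo/bert4torch/blob/master/README.md). [Convert scripts](https://github.com/Tongjilibo/bert4torch/tree/master/examples/convert_script) are provided for some pretrained weights, and the config parameters are documented in the [config notes](https://github.com/Tongjilibo/bert4torch/blob/master/examples/convert_script/PLM_config.md).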

documentation

A simple implementation in one Python file on a single GPU

I have implemented a simple hacked LLaMA in a single Python file; it consumes 14 GB of VRAM in fp16 mode. You can try it here: https://github.com/Tongjilibo/bert4torch/blob/master/examples/basic/basic_language_model_llama.py
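As a rough sanity check on that figure, weight memory can be estimated as parameter count times bytes per parameter. The sketch below assumes a nominal 7-billion-parameter model (the post does not state the model size) and counts weights only; activations and the key/value cache add to this at run time.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 7e9  # assumption: nominal 7B-parameter model

fp16_gb = weight_memory_gb(NUM_PARAMS, 2)  # 2 bytes per fp16 weight
int8_gb = weight_memory_gb(NUM_PARAMS, 1)  # 1 byte per int8 weight
print(fp16_gb, int8_gb)
```

Under these assumptions the fp16 weights alone account for about 14 GB, and int8 weights for about half of that.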

Reproduced an inference script; after int8 quantization it runs on a single GPU with 8 GB of VRAM

I reproduced several recently released large models. After int8 quantization they need roughly 8 GB of VRAM and run on a single GPU: [belle](https://github.com/Tongjilibo/bert4torch/blob/master/examples/basic/basic_language_model_belle.py), [chatglm](https://github.com/Tongjilibo/bert4torch/blob/master/examples/basic/basic_language_model_chatglm.py), [llama](https://github.com/Tongjilibo/bert4torch/blob/master/examples/basic/basic_language_model_llama.py)
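For background on what int8 weight quantization does, here is a minimal sketch of symmetric absmax quantization for a single weight row. It is a generic textbook scheme, not the code used in the linked scripts, and the helper names `quantize_row` and `dequantize_row` are hypothetical.

```python
def quantize_row(row):
    """Symmetric absmax quantization of one weight row to the range [-127, 127]."""
    absmax = max(abs(x) for x in row)
    scale = absmax / 127 if absmax > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    q = [max(-127, min(127, round(x / scale))) for x in row]
    return q, scale


def dequantize_row(q, scale):
    """Map int8 codes back to approximate floating-point weights."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_row(weights)
restored = dequantize_row(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_err)
```

Each weight is stored as one byte plus a shared per-row scale, so the rounding error per weight is bounded by half a quantization step.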

Low-cost deployment of moss, following chatglm's int8 approach

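An [int8-quantized deployment of moss](https://github.com/Tongjilibo/bert4torch/blob/master/examples/basic/basic_language_model_moss.py) modeled on chatglm-6b, with a minimum single-GPU footprint of about 18 GB. Converted chatglm-6b, belle and llama-7b are also available for inference (including quantized versions that run on a single 12 GB GPU) and fine-tuning; see [bert4torch](https://github.com/Tongjilibo/bert4torch/blob/master/examples/llm/README.md)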

[BUG/Help] <What is the difference between the provided quantization script and hf's load_in_8bit?>

### Is there an existing issue for this? - [X] I have searched the existing issues ### Current Behavior What is the difference between the provided quantization script and hf's load_in_8bit? ### Expected Behavior _No response_ ### Steps To Reproduce What is the difference between the provided quantization script and hf's load_in_8bit?...

[BUG/Help] <Does the rotary relative position encoding differ from chatglm v1?>

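### Is there an existing issue for this? - [X] I have searched the existing issues ### Current Behavior After reading the code, the relative position encoding seems to be implemented differently from chatglm v1. Could you explain where the difference lies? ### Expected Behavior _No response_ ### Steps To Reproduce...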
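As context for the question, the sketch below shows the standard rotary position embedding idea in plain Python: consecutive pairs of dimensions are rotated by an angle proportional to the token position, so the query-key dot product depends only on the position offset. It makes no claim about how either chatglm version implements it; real implementations differ in how dimensions are paired and which dimensions are rotated. The names `rope` and `dot` are chosen for this example.

```python
import math


def rope(vec, pos, base=10000.0):
    """Rotate pairs (x0, x1), (x2, x3), ... of vec by angle pos * theta_i."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = base ** (-i / d)  # frequency for this pair of dimensions
        angle = pos * theta
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]
# Both pairs of positions have the same offset of 3.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 13), rope(k, 10))
print(score_a, score_b)
```

The two scores agree up to floating-point error because only the offset between the query and key positions matters.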

For those who want to use a model that is not built in

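First of all, thanks and welcome to everyone offering valuable suggestions~ Some issues are about a desired pretrained model not being built in. The explanation is as follows:
- **Cause**: some models are not that influential, and with limited personal development time only the commonly used models can be maintained first.
- **Suggested approach**: bert4torch supports loading transformers models. In that case bert4torch acts only as a trainer: the network structure still lives in transformers, but the training process and callback invocation work the same as before. Tutorial: [tutorials_load_transformers_model.py](https://github.com/Tongjilibo/bert4torch/blob/master/examples/tutorials/tutorials_load_transformers_model.py)
- **Ideal approach**: try implementing the model in bert4torch yourself, then open a pull request and become a contributor, giving back to the community together.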

documentation