
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

163 BELLE issues

Line 172 of BELLE/1.5M/zh_seed_tasks.json reads "给定一个主题,基于这个主题写一篇作为,要求立意清晰,思想积极向上。。" — it should be "作文" (essay), not "作为".
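
A one-off patch along these lines would fix the typo in place (a sketch; it assumes the file is UTF-8 text at the path given in the issue):

```
# Hypothetical one-off fix for the typo reported above; path taken from the issue.
path = "BELLE/1.5M/zh_seed_tasks.json"
with open(path, encoding="utf-8") as f:
    text = f.read()
with open(path, "w", encoding="utf-8") as f:
    f.write(text.replace("写一篇作为", "写一篇作文"))
```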

```
deepspeed --num_gpus=1 finetune.py --model_config_file run_config/Llama_config.json --deepspeed run_config/deepspeed_config.json
```
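
For reference, a config of the kind the `--deepspeed` flag expects might look like the following. The field names are standard DeepSpeed options; the values are illustrative, not BELLE's shipped run_config/deepspeed_config.json:

```
# Minimal DeepSpeed config sketch: micro-batching, fp16, and ZeRO stage 2.
import json

deepspeed_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 16,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}

with open("run_config/deepspeed_config.json", "w") as f:
    json.dump(deepspeed_config, f, indent=2)
```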

Hello authors — my GPU memory is too small, so I would like to fine-tune a quantized model. I downloaded BELLE-7B-gptq; how should I configure Bloom_config.json? Looking forward to your reply, thanks.

Hi all. I ran the Bloom model with fp16 changed to False, and it failed with the following error:

```
Traceback (most recent call last):
  File "finetune.py", line 236, in <module>
    train(args)
  File "finetune.py", line 214, in train
    trainer.train(resume_from_checkpoint = args.resume_from_checkpoint)
  File "/root/anaconda3/envs/Belle/lib/python3.8/site-packages/transformers/trainer.py", line 1662, in train
    return inner_training_loop(...
```
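
For context, the fp16 switch ultimately feeds transformers' `TrainingArguments`; a minimal sketch of the flag in isolation (the BELLE-specific config plumbing is elided):

```
# fp16=False trains in full precision, roughly doubling activation and
# optimizer memory; the flag is a standard transformers TrainingArguments field.
from transformers import TrainingArguments

args = TrainingArguments(output_dir="out", fp16=False)
```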

I was running gptq, using the inference script from the md docs:

```
CUDA_VISIBLE_DEVICES=0 python bloom_inference.py bloom --wbits 4 --groupsize 128 --load bloom/bloom7b-0.2m-8bit-128g.pt --text "hello"
```

After quantization, BELLE-7B (Bloom) inference is significantly slower. BELLE-7B (LLaMA) inference also slows down somewhat after quantization. Code:

```
import time
import torch
import torch.nn as nn
from gptq import *
from modelutils import *
from quant import *
from transformers import AutoTokenizer
from random...
```
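
To quantify the slowdown, a rough tokens-per-second check like the one below can be run once against the fp16 checkpoint and once against the quantized one (a sketch, not BELLE's benchmark; the model name is a placeholder for the local checkpoint):

```
# Time a single generate() call and report tokens/sec.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "BelleGroup/BELLE-7B-2M"  # placeholder: substitute the local checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16).cuda()

inputs = tokenizer("hello", return_tensors="pt").to("cuda")
torch.cuda.synchronize()
start = time.time()
out = model.generate(**inputs, max_new_tokens=64)
torch.cuda.synchronize()
elapsed = time.time() - start
new_tokens = out.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.1f} tokens/sec")
```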

Reproduced several recently released large models; after int8 quantization each needs roughly 8 GB of GPU memory and can run on a single card: [belle](https://github.com/Tongjilibo/bert4torch/blob/master/examples/basic/basic_language_model_belle.py), [chatglm](https://github.com/Tongjilibo/bert4torch/blob/master/examples/basic/basic_language_model_chatglm.py), [llama](https://github.com/Tongjilibo/bert4torch/blob/master/examples/basic/basic_language_model_llama.py)
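
For comparison, int8 loading with plain transformers/bitsandbytes (rather than the bert4torch examples linked above) follows this pattern — a sketch, assuming bitsandbytes is installed; the checkpoint name is illustrative:

```
# load_in_8bit quantizes the linear layers at load time, bringing a 7B model
# down to roughly 8 GB so it fits on a single consumer GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "BelleGroup/BELLE-7B-2M"  # illustrative
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,
)
```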

Is continued (incremental) pre-training on the original llama supported?
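
For reference, continued pre-training is just causal-LM training resumed from the original weights; a minimal sketch with transformers' Trainer (paths and dataset are placeholders, and this is not BELLE's finetune.py):

```
# Continued pre-training sketch: plain next-token objective on raw text,
# starting from the original LLaMA weights. All names are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "path/to/llama-7b-hf"  # placeholder: converted original LLaMA weights
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

dataset = load_dataset("text", data_files={"train": "corpus.txt"})["train"]
dataset = dataset.map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=1,
                           gradient_accumulation_steps=16, fp16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```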

As the title says: my environment is Python 3.9 + PyTorch 1.9.0 + CUDA 11.2, loading the 4-bit model, but the quant_cuda extension built via setup_cuda.py apparently cannot be imported properly. Should that file be placed in the root of the gptq folder?
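
A quick way to check whether (and from where) the compiled extension is visible — a sketch, assuming it was built with `python setup_cuda.py install` from the GPTQ directory, which installs a compiled module named quant_cuda rather than leaving a .cpp file to import:

```
# Verify the compiled GPTQ CUDA extension is importable and locate it on disk.
import sys

try:
    import quant_cuda  # the compiled extension built by setup_cuda.py
    print("quant_cuda imported from:", quant_cuda.__file__)
except ImportError as err:
    print("import failed:", err)
    print("searched paths:", *sys.path, sep="\n  ")
```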