BELLE issues

Results 163 BELLE issues

Sort by recently updated

LLaMA-7B-EXT

请问公开的LLaMA-7B-EXT是扩展词表后二次与训练的base model吗？会公布基于下面数据集finetune之后的模型吗？ ``` LLaMA-7B-EXT | zh(alpaca-3.5&4) + sharegpt + BELLE-0.5M-CLEAN | 0.762 ```

qingswu

谁能帮忙解决呀，都是用的官方的docker;File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/zero/stage3.py", line 133, in init self.dtype = self.optimizer.param_groups[0]['params'][0].dtype

listwebit

源码下载执行 sh training_scripts/single_node/run_LoRA.sh 报错如下： len(train_dataloader) = 334 len(train_dataset) = 1000 args.per_device_train_batch_size = 1 len(eval_dataloader) = 334 len(eval_dataset) = 1000 args.per_device_eval_batch_size = 1 [2023-04-23 11:34:49,179] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info:...

listwebit

BELLE-LLaMA-EXT-7B和BELLE-on-Open-Datasets的问题

您好：测试了一下这两个新的7B模型，发现各自存在一些问题： 1. BELLE-on-Open-Datasets 在中文指令下，会比较高频地乱入一些英文，同样的prompt下BELLE-7B-2M并没有这样的问题； 2. BELLE-LLaMA-EXT-7B模型的指令模版似乎不是"Human: {instruction} \n\nAssistant: "， `prompt = "Human: 写一首中文歌曲，赞美大自然 \n\nAssistant: " input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device) generate_ids = model.generate(input_ids, max_new_tokens=300, do_sample = True, top_k = 30,...

shaomai00

brainstorming vs generation

对给定测试集合中的brainstorming和generation的划分比较迷惑，请问一下，在划分这两个类别时候的主要依据是什么？

makai281

LLaMA7B增量预训练

请问会开源LLaMA7B增量预训练的代码嘛？预计什么时候开源？

XiaoYee

数据质量

请教下如何评估微调数据的质量及数据多样性？

itboyspg

词表

冒昧的问一下，词表是怎么扩充的？是用新的语料得出vocab，与原始得vocab拼接，然后对拼接的部分进行训练吗？

Wzhsgsg

使用CUDA_VISIBLE_DEVICES环境变量和bloom_inference.py 无法实现双卡推理

执行shell 命令： `CUDA_VISIBLE_DEVICES=0,1 python bloom_inference.py BELLE-7B-2M --text "hello"` 提示： `torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 23.69 GiB total capacity; 22.83 GiB already allocated; 162.94...

blizzardwj

开源的BELLE/train/main.py只支持指令跟随数据集，不支持类似sharegpt多轮对话，不能复现论文。

你好，https://github.com/LianjiaTech/BELLE/blob/main/train/reproduce_our_papers/Towards%20Better%20Instruction%20Following%20Language%20Models%20for%20Chinese:%20Investigating%20the%20Impact%20of%20Training%20Data%20and%20Evaluation.md 里面提到可以使用 https://github.com/LianjiaTech/BELLE/blob/main/train/README.md 里面的main.py进行复现。 ![image](https://user-images.githubusercontent.com/233871/233967018-e0873333-2231-4685-8e49-f2393f5a81ac.png) 我看了BELLE\train\utils\data\raw_datasets.py文件中，对数据集的处理方式只有指令跟随。 ![image](https://user-images.githubusercontent.com/233871/233967552-c20c8cf3-7740-402f-b7be-45b23fc14b1b.png) 没有对上述对话的处理方式。想问下多轮对话的数据处理方式是什么？

baibaiw5

BELLE
BELLE copied to clipboard

Metadata

LLaMA-7B-EXT

谁能帮忙解决呀，都是用的官方的docker;File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/zero/stage3.py", line 133, in init self.dtype = self.optimizer.param_groups[0]['params'][0].dtype

哪个大佬救救孩子吧，这个问题好几天了，都没有解决

BELLE-LLaMA-EXT-7B和BELLE-on-Open-Datasets的问题

brainstorming vs generation

LLaMA7B增量预训练

数据质量

词表

使用CUDA_VISIBLE_DEVICES环境变量和bloom_inference.py 无法实现双卡推理

开源的BELLE/train/main.py只支持指令跟随数据集，不支持类似sharegpt多轮对话，不能复现论文。

← Metadata

Owner

Metadata

BELLE BELLE copied to clipboard

Metadata

← Metadata

Owner

Metadata

BELLE
BELLE copied to clipboard