BELLE issues

Results 163 BELLE issues


run docker: RuntimeError: Unable to proceed, no GPU resources available

(base) ub2004@ub2004-B85M-A0:~/llm_dev/BELLE/train$ sudo docker run -it belle:v1.0 /bin/bash
[sudo] password for ub2004:
=============
== PyTorch ==
=============
NVIDIA Release 22.08 (build 42105213)
PyTorch Version 1.13.0a0+d321be6
Container image Copyright (c) 2022,...

SeekPoint

Was the data generated with davinci-003 or turbo-3.5?

syp1997

bloom-7B ships FP16 parameters and is a dozen or so GB. Why is the belle-7B model over twenty GB? Was it converted from FP16 to FP32?

listwebit
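
The size question above comes down to simple arithmetic, sketched here as a rough check. It assumes a round figure of 7 billion parameters (an assumption, not a number taken from this thread) and ignores file-format overhead. The result is consistent with an FP16-to-FP32 conversion, but it does not confirm what the BELLE checkpoints actually contain.

```python
def checkpoint_size_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: a round 7 billion parameters for a "7B" model.
NUM_PARAMS = 7_000_000_000

fp16_gb = checkpoint_size_gb(NUM_PARAMS, 2)  # FP16 stores 2 bytes per parameter
fp32_gb = checkpoint_size_gb(NUM_PARAMS, 4)  # FP32 stores 4 bytes per parameter

print(f"FP16: ~{fp16_gb:.0f} GB, FP32: ~{fp32_gb:.0f} GB")
```

Doubling the bytes per parameter doubles the file size, which matches the jump from "a dozen or so GB" to "over twenty GB" described in the question.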

run_LoRa fails in the docker environment even with three 32G V100s, while the earlier finetune script runs fine

Our model is BLoom-2M and we use the docker environment, running bash training_scripts/single_node/run_LoRA.sh output-lora 2; we also tried 3, and it still does not run. With the earlier version's finetune script, however, LoRA runs fine. Why is that? Is the current LoRA support not finished yet? The following error appears:
[2023-04-25 10:52:32,890] [INFO] [utils.py:793:see_memory_usage] CPU Virtual Memory: used = 47.61 GB, percent = 18.9%
Traceback (most recent call last): File "main.py", line 402, in...

listwebit

Question about the results

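Why can vicuna13b reach 90% of ChatGPT with only 70,000 instruction examples, while this project has used over a million? In principle a large model's cross-lingual transfer ability should be strong. Or is Vicuna's evaluation simply not comprehensive enough?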

Minami-su

How do I extend the Chinese vocabulary for pretraining?

Hello, how is the vocabulary extended? Do the word embeddings here cover both of the layers below, or only one? Also, were the original English token embeddings fine-tuned, or were the old embeddings simply copied over?
1. LlamaForCausalLM.embed_tokens
2. LlamaForCausalLM.lm_head
--------------
Train a tokenizer with a vocabulary of 50K tokens on 12M lines of Chinese text. Merge the trained vocabulary with the original LLaMA vocabulary,...

baibaiw5

Comparison of results in the paper

For the comparison of results in the paper, did you actually compare against BLOOMZ-7B1-mt after SFT?

HalcyonLiang

BELLE-LLaMA-EXT-7B model md5 mismatch

md5 mismatch: https://huggingface.co/BelleGroup/BELLE-LLaMA-EXT-7B After downloading, the md5sum of pytorch_model-00001-of-00002.bin.3b0666c50d7fd55d5116e788ec51aa96a34ba6816e86ffbee1dbe983bf511b4b.enc does not match the one on the official site. How can I resolve this?

allenhung1025
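
One way to double-check a mismatch like the one reported above is to recompute the checksum independently of the `md5sum` tool. This is a minimal sketch using only Python's standard `hashlib`; the function name `file_md5` and the chunk size are illustrative choices, and the file path is whatever was downloaded locally.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks
    so that multi-gigabyte checkpoints do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `file_md5` on the downloaded `.enc` file and comparing the result with the published value shows whether the file on disk differs. If both tools agree with each other but not with the published value, an incomplete or corrupted download is one possible cause worth ruling out by downloading again.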

Program crashes during data loading

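Hello, I am training bloom-7b on 8 GPUs with 45G each. I added some monolingual Chinese and English data, about 60 million examples in total, and the program crashed once a little over 300,000 examples had been loaded. What usually causes this? `length of train_dataset(after get_train_data): 59719579 100%|██████████| 1/1 [00:00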

jiezhangGt

BELLE-2/Belle-whisper-large-v2-zh recognizes Chinese audio worse than Systran/faster-whisper-large-v2?

Hello authors, in my experiments BELLE-2/Belle-whisper-large-v2-zh performs worse than Systran/faster-whisper-large-v2. In principle, a model fine-tuned on Chinese data should outperform faster-whisper. ![8b13db23eb3623ca46e960604294ee4](https://github.com/LianjiaTech/BELLE/assets/14495189/d710b2e9-939e-4770-82de-d41414f47950) The test audio file I used is here: https://drive.google.com/file/d/1UTGOlnc3c_5FDHv_hH3IyNgNjxHNKQkD/view?usp=sharing This is how I used it: ![76e519652388d62f9d030ec5ff0a196](https://github.com/LianjiaTech/BELLE/assets/14495189/f56a33bb-0f4b-4796-a7e2-78fb14dd2766) ![692c67b61fa13fc8a246cefebfc31b6](https://github.com/LianjiaTech/BELLE/assets/14495189/b20db703-e617-4b09-9c72-a7f1e8177fc7) How can I get good results?

drilistbox
