Open-Llama

The complete training code for the open-source, high-performance Llama model, covering the full process from pre-training to RLHF.

16 Open-Llama issues

I am currently doing some tests on different open source conversational models available on the web. In this context I would like to test OpenLlama at its full capabilities (so...

Dear authors, amazing work on this repository. I am interested in knowing more about your plans regarding RLHF - what framework did you have in mind for achieving this and...

First of all, thank you for this project! I had some questions about how training was done, as I've struggled to scale up training to larger model sizes when using transformers...

Is it possible to run instruct finetuning on V100s with 32GB of memory? I currently have 4 V100s with 32GB each; is there a way to do instruct finetuning on these 4 cards? I tried stage2 and still ran out of GPU memory. With the ds_stage3 config file I got the error below; does anyone know the cause? Many thanks. Launch command: accelerate launch --config_file configs/accelerate_configs/ds_stage3.yaml train_lm.py --train_config configs/instruct_config.yaml --model_config configs/model_configs/7B.json Error: File "/home/fenbi/miniconda3/envs/mc-model/lib/python3.9/site-packages/transformers/models/open_llama/modeling_open_llama.py", line 385, in _init_weights module.weight.data[module.padding_idx].zero_() IndexError: index 32000 is out of bounds for...
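
The IndexError above is typically what happens when padding_idx (here 32000) falls outside the embedding table, i.e. the tokenizer has one more token than the model's vocab_size. Below is a minimal sketch of one workaround using the generic transformers API; the checkpoint and tokenizer paths are placeholders, not files from this repo.

from transformers import AutoTokenizer, AutoModelForCausalLM

# Placeholder paths for illustration only.
tokenizer = AutoTokenizer.from_pretrained("path/to/tokenizer", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("path/to/checkpoint")

# If the tokenizer defines a pad token beyond the original vocab (e.g. id 32000
# with vocab_size 32000), grow the embedding table so padding_idx is in range.
if len(tokenizer) > model.get_input_embeddings().weight.shape[0]:
    model.resize_token_embeddings(len(tokenizer))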

When you finetune llama, have you recorded the resources used per node for each model size (7B, 13B, 30B)? For example, GPU memory and CPU memory usage on a single card, and on eight cards? I'm finding that full-parameter finetuning uses a huge amount of CPU memory; do you have any particular optimizations for this?
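
For reference, the CPU memory pressure during full-parameter finetuning usually comes from ZeRO offloading optimizer and parameter state to host RAM. The sketch below shows the relevant DeepSpeed ZeRO-3 settings as a Python dict; the values are illustrative and do not come from this repo's ds_stage3.yaml.

# Offloading optimizer/parameter state to the CPU is what typically drives the
# large host-RAM footprint; removing the offload sections keeps that state on
# the GPUs instead (more VRAM used, less CPU memory).
ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
    "bf16": {"enabled": True},
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}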

There is no "Lumphini beach" in Bangkok, and there is no monkey forest to play in. It seems like a bad example to use as the main example for the repo. Maybe...

Hello, I ran SFT on your V1 pre-trained model with two sets of data and compared it against SFT on BLOOM 3B. The pre-trained base you released performed remarkably poorly; do you know what the reason might be? Do you have any suggestions?

@s-JoL Many thanks for sharing such great work; Chinese pre-trained models are really scarce. I'd like to know whether the current pre-trained model has been evaluated on any benchmark sets?

LLaMA does not enable bias terms by default, but following Su Jianlin's latest idea, adding the bias terms back to q and k can noticeably improve length extrapolation. Would the author consider testing this in pre-training? https://kexue.fm/archives/9577
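
A minimal sketch of the proposed change, assuming a LLaMA-style attention block; the class below is illustrative and not this repo's actual module. Only the query/key projections get bias=True, while value/output keep the usual bias-free default.

import torch.nn as nn

class AttentionProjections(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        head_dim = hidden_size // num_heads
        # bias=True on q/k is the suggested change; v and o keep the LLaMA default (no bias).
        self.q_proj = nn.Linear(hidden_size, num_heads * head_dim, bias=True)
        self.k_proj = nn.Linear(hidden_size, num_heads * head_dim, bias=True)
        self.v_proj = nn.Linear(hidden_size, num_heads * head_dim, bias=False)
        self.o_proj = nn.Linear(num_heads * head_dim, hidden_size, bias=False)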

Hello, I tried to download the checkpoints of Open-Llama V2 from https://huggingface.co/s-JoL/Open-Llama-V2 but the link is not available anymore. I tried to do the same from Python: tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V2", use_fast=False) Traceback...