ZeyuTeng96 issues

Results: 16 issues of ZeyuTeng96

Question about the open-source text summarization examples

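Hello, of the three open-source text summarization examples at the link below, which model is suitable for Chinese text summarization? As I recall, all three examples use English datasets. https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/text_summarization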

How to build a Chinese multi-turn dialogue bot based on the existing open-source English plato-2 model

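Hi everyone, how can I build a Chinese multi-turn dialogue bot based on the existing open-source English plato-2 model? I read the link below, but I still do not really understand how to use the English plato-2 to build a plato-2 model suited to Chinese multi-turn dialogue. Could anyone provide more details? And could those who have already implemented this share some code for reference? Thanks. Link: https://github.com/PaddlePaddle/Knover/issues/25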

Tried to run inference from an hf model, but it produces nothing

Hi friends, I tried to load an hf llama model and ran inference with the generation script provided in this repo (with slight modifications). But I got nothing....

Trained alpaca on a small subset of the official dataset, but the output is messy

Hi, can anyone help me with this? I tried to use a small subset of the official dataset for training. However, the training loss is quite high. And the responses which...

Alpaca problem solving team - QQ chat group

Hi all, you are welcome to join the QQ chat group to discuss problems and share experience. The QQ chat group number is: 397447632

Will the three steps of Deepspeed Chat be implemented later?

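I see that the train folder already integrates the SFT stage of deepspeed chat. Will the remaining two steps be implemented later? And will some open-source datasets, such as reward model datasets, be provided?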

The following warning appears: tried to get lr value before scheduler/optimizer started stepping, returning lr=0

Hello, when using the finetune script to fine-tune the bloom-7b model on an instruction-tuning dataset, the first few steps print: tried to get lr value before scheduler/optimizer started stepping, returning lr=0. What causes this warning? The bloom config is: { "model_type": "bloom", "model_name_or_path": "bigscience/bloomz-7b1-mt", "data_path": "data/res/merge_data.json", "output_dir": "trained_models/bloom", "per_device_train_batch_size": 1, "num_epochs": 2, "learning_rate": 1e-5,...

Question about using hf run_clm to pre-train llama

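### Checklist before asking - [x ] Because the relevant dependencies are updated frequently, please make sure you follow the relevant steps in the [Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki) - [x ] I have read the [FAQ section](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/常见问题) and searched the Issues for this problem, and found no similar question or solution - [x ] Third-party plugin issues: e.g. [llama.cpp](https://github.com/ggerganov/llama.cpp), [text-generation-webui](https://github.com/oobabooga/text-generation-webui), [LlamaChat](https://github.com/alexrozanski/LlamaChat); it is also recommended to look for solutions in the corresponding project ### Select issue type Base model: - [x ] LLaMA Issue type: - [x ] Other ### Detailed description Hi, I would also like to use hf's run_clm to continue pre-training the llama model. The unlabeled pre-training text is a .txt file in which each line is one text or one paragraph. However, after tokenizing each line, the run_clm script applies a group_text operation, by default to a length of 1024. So if a line is shorter than 1024 tokens, the next line's text is concatenated onto the current one. Is that understanding correct?...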
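The grouping step asked about in this issue can be illustrated with a small sketch. This is a simplified, self-contained illustration modeled on the concatenate-then-chunk logic of `run_clm`, not the script's exact code. The function signature, the made-up token ids, and the block size of 4 (standing in for 1024) are assumptions chosen to keep the example readable.

```python
from itertools import chain


def group_texts(examples, block_size):
    """Concatenate tokenized lines and cut the result into fixed-size blocks."""
    # Flatten each column (e.g. input_ids) across all lines into one long list,
    # so a short line is followed directly by the tokens of the next line.
    concatenated = {k: list(chain.from_iterable(v)) for k, v in examples.items()}
    total_length = len(next(iter(concatenated.values())))
    # Drop the trailing remainder that does not fill a whole block.
    total_length = (total_length // block_size) * block_size
    result = {
        k: [seq[i:i + block_size] for i in range(0, total_length, block_size)]
        for k, seq in concatenated.items()
    }
    # For causal language modeling, the labels are a copy of the inputs.
    result["labels"] = [block[:] for block in result["input_ids"]]
    return result


# Three short "lines" of hypothetical token ids, grouped with block_size=4.
toy = {"input_ids": [[1, 2, 3], [4, 5], [6, 7, 8, 9, 10]]}
blocks = group_texts(toy, block_size=4)
print(blocks["input_ids"])  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

In this toy run, the first block mixes tokens from the first and second lines, and the leftover tokens 9 and 10 are discarded because they do not fill a full block. That matches the reading in the question: lines shorter than the block size are joined with the text that follows them.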

stale

Create chatglm_stream_api.py

Adds a streaming API feature