xiaozhu1106 issues

Results: 4 issues by xiaozhu1106

When using transformers, is there a way to load a quantized model on the GPU?

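### Pre-submission checklist

- [x] Because the relevant dependencies are updated frequently, please make sure you follow the steps in the [Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki)
- [x] I have read the [FAQ section](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/常见问题) and searched the Issues for this problem, and found no similar issue or solution
- [x] Third-party plugin issues: e.g. [llama.cpp](https://github.com/ggerganov/llama.cpp), [text-generation-webui](https://github.com/oobabooga/text-generation-webui), [LlamaChat](https://github.com/alexrozanski/LlamaChat), etc.; it is also recommended to look for solutions in the corresponding project

### Select issue type

Base model:

- [ ] LLaMA
- [x] Alpaca

Issue type:

- [ ] Download issue
- [ ] Model conversion and merging issue...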

stale

Pre-training data filtering and processing

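The pre-training data filtering methods mentioned are data-source quality-score filtering and tf-idf soft deduping. Could you provide concrete implementations of these two filtering methods?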
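For readers unfamiliar with the term, the following is a minimal sketch of what threshold-based ("soft") tf-idf deduplication can look like. It is not the project's actual implementation, which the issue does not describe. The function names (`tfidf_vectors`, `cosine`, `soft_dedup`), the whitespace tokenizer, the smoothed idf formula, and the `0.8` threshold are all assumptions chosen for illustration. Chinese text would need a real word segmenter, and a large corpus would need an approximate-neighbor index instead of this quadratic scan.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalised {token: weight} dict per document."""
    # Assumption: naive whitespace tokenisation, for illustration only.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed idf, so tokens present in every document keep a non-zero weight.
        vec = {
            tok: (count / len(toks)) * (math.log((1 + n_docs) / (1 + df[tok])) + 1)
            for tok, count in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({tok: w / norm for tok, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalised sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def soft_dedup(docs, threshold=0.8):
    """Return indices of documents kept after near-duplicate removal.

    A document is dropped when its tf-idf cosine similarity to any
    already-kept document reaches the (hypothetical) threshold.
    """
    vectors = tfidf_vectors(docs)
    kept = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return kept


docs = [
    "the cat sat on the mat",
    "the cat sat on the mat today",
    "large language models need clean data",
]
print(soft_dedup(docs))
```

With these three toy documents, the second is dropped as a near-duplicate of the first, while the third shares no tokens with the others and is kept. Raising the threshold makes the filter more lenient.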

Are the C-Eval evaluation results based on the pre-trained model? The evaluation script raises an error

### Required prerequisites

- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/baichuan-inc/baichuan-7B/issues) and [Discussions](https://github.com/baichuan-inc/baichuan-7B/discussions) that this hasn't already been reported. (+1 or comment...

question

What is the advantage compared with LoRA?

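Hello, I have a few questions:

1. Pre-training with LoRA also trains only the newly added LoRA parameters. So what is the advantage over LoRA?
2. When pre-training this way to avoid forgetting, is it still necessary to mix in an appropriate amount of general data when adding domain data?
3. The SFT stage uses full-parameter training, right? In that case, forgetting still cannot be avoided during SFT?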