Louis

Results: 6 issues by Louis

### Describe the problem in detail

I'd like to ask how necessary it is to expand the Chinese vocabulary. A tokenizer based on Unicode encoding can in theory encode any Chinese character, so if the vocabulary is not expanded and the model is pretrained and fine-tuned directly on a Chinese corpus, how much does that affect model quality? I have seen work on other languages, such as Japanese, that does not expand the vocabulary. The wiki also says "most related derivative works pretrain/finetune directly on the original model", so is the difference only a matter of encoding efficiency or of final quality? (A quick tokenization comparison is sketched after the checklist below.)

### Screenshots or logs

### Checklist
- [x] Which model the issue concerns: LLaMA / Alpaca **(keep only the one you are asking about)**
- [x] Issue type: **(keep only the one you are asking about)**
  - Quality issue
  - Other issue
- [x] Since the related dependencies are updated frequently, please make sure you have followed the steps in the [Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki)
- [x] I have read the [FAQ section](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/常见问题) and searched the existing issues without finding a similar problem or solution
- [x] Third-party plugin issues: e.g. [llama.cpp](https://github.com/ggerganov/llama.cpp), [text-generation-webui](https://github.com/oobabooga/text-generation-webui), [LlamaChat](https://github.com/alexrozanski/LlamaChat); it is also recommended to look for solutions in the corresponding project
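On the encoding-efficiency point, the difference is easy to see by tokenizing the same sentence with both tokenizers. This is a minimal sketch assuming the `transformers` library; the two Hugging Face repo IDs are illustrative assumptions, not taken from this issue.

```python
# Hypothetical comparison: original LLaMA tokenizer vs. an expanded
# Chinese tokenizer on the same sentence. Repo IDs are assumptions.
from transformers import LlamaTokenizer

text = "今天天气真不错"  # "The weather is really nice today"

# Original LLaMA vocabulary: most Chinese characters are not in the
# vocabulary and fall back to UTF-8 byte tokens (~3 tokens per character).
orig = LlamaTokenizer.from_pretrained("huggyllama/llama-7b")

# Expanded Chinese vocabulary: common characters/words get dedicated
# tokens (~1 token per character, often fewer for frequent words).
expanded = LlamaTokenizer.from_pretrained("hfl/chinese-llama-2-7b")

print("original :", len(orig.tokenize(text)), "tokens")
print("expanded :", len(expanded.tokenize(text)), "tokens")
```

Without expansion the model can still represent any Chinese text via byte fallback, but each character costs roughly three context positions and three decoding steps, which is the efficiency gap the wiki quote refers to.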

Hi team, I'm fine-tuning with 6 V100 GPUs and the process is extremely slow. I'm using fp16 and attn_impl: torch, with a global_train_batch_size of 12 and device_train_microbatch_size automatically...
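For reference on how these settings interact (a general data-parallel relationship, stated here as an assumption about this setup rather than anything confirmed in the issue): the global batch is split evenly across GPUs, and each per-device batch is processed in microbatch-sized gradient-accumulation steps.

```python
# Hypothetical arithmetic for the reported configuration. The microbatch
# size of 1 is an assumed value, since the issue text is truncated.
num_gpus = 6
global_train_batch_size = 12

device_train_batch_size = global_train_batch_size // num_gpus   # 2 samples per GPU
device_train_microbatch_size = 1                                # assumption
grad_accum_steps = device_train_batch_size // device_train_microbatch_size  # 2

print(f"{device_train_batch_size=} {grad_accum_steps=}")
```

A tiny microbatch keeps memory low but means many small forward/backward passes per optimizer step, which is one common reason training feels slow on V100s.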

Thanks for your great work. I'm running an MPT model on an NVIDIA V100 GPU. I think the compilation process went well, but the GPU cannot be utilized during inference. Here is...
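When inference runs but GPU utilization stays at zero, a common first check is whether PyTorch can see the device and whether the model and inputs were actually moved onto it. This is a generic diagnostic sketch assuming a PyTorch-based setup, not the reporter's code.

```python
# Hypothetical diagnostic: confirm the GPU is visible, and remember that
# both the model and its inputs must be moved to it explicitly; otherwise
# inference silently runs on the CPU.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))  # e.g. "Tesla V100-SXM2-16GB"

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# model = model.to(device)                               # weights onto the GPU
# inputs = {k: v.to(device) for k, v in inputs.items()}  # inputs too
```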

Hello, multi-GPU training on V100s always hits a timeout error; it fails with both 4 GPUs and 2 GPUs. A single GPU does not seem to have this problem, but it is slow: fine-tuning on 50k samples takes about 12 hours.

```bash
[E ProcessGroupNCCL.cpp:828] [Rank 1] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=80, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1805926 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 0] Watchdog caught collective operation timeout:...
```
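The Timeout(ms)=1800000 in the log is PyTorch's default 30-minute NCCL watchdog. One generic mitigation, offered here as an assumption rather than something confirmed in this thread, is to raise the timeout when initializing the process group; note this only buys time and does not fix whatever makes a rank stall in the ALLGATHER.

```python
# Hypothetical sketch: raise the NCCL collective timeout above the default
# 30 minutes (the 1800000 ms seen in the log). Assumes the usual env://
# initialization (RANK, WORLD_SIZE, MASTER_ADDR set by the launcher).
from datetime import timedelta

import torch.distributed as dist

dist.init_process_group(
    backend="nccl",
    timeout=timedelta(hours=2),  # default is timedelta(minutes=30)
)
```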

pending

### Required prerequisites
- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/baichuan-inc/baichuan-7B/issues) and [Discussions](https://github.com/baichuan-inc/baichuan-7B/discussions) and verified that this hasn't already been reported. (+1 or comment...

question