Shuo Zhang

10 comments by Shuo Zhang

@tjruwase My devices are 8*A100 (80G) with 1024G of RAM, and I have found another solution: I noticed that `pin_memory: false` in `ds_config` didn't do anything, so I add...
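For reference, `pin_memory` lives under the ZeRO offload section of the DeepSpeed config. A minimal sketch of that layout, with the stage and device values as illustrative assumptions (the original comment does not show the full config):

```python
# Illustrative ds_config fragment (assumed values); pin_memory controls whether
# the CPU-side offload buffers are allocated in page-locked memory.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {
            "device": "cpu",
            "pin_memory": False,  # the setting discussed above
        },
    },
}
```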

Hi @x54-729 I met exactly the same problem, and I noticed that it works fine when I disable the **offload optimizer** feature. Not sure why.
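For anyone comparing configs: disabling optimizer offload just means leaving the `offload_optimizer` block out of `zero_optimization` entirely; a sketch, with the ZeRO stage as an assumed value:

```python
# Sketch: optimizer offload disabled by omitting "offload_optimizer" altogether.
ds_config = {
    "zero_optimization": {
        "stage": 3,  # assumed; keep whatever stage you already use
    },
}
```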

Hi @ctjian This happens because neither your CPU memory nor your GPU memory can hold the model, so part of its parameters has to be offloaded to disk. You can try:
1. Use a GPU with more memory, or use more GPUs. Set the environment variable `CUDA_VISIBLE_DEVICES` to control how many GPUs the model can use, e.g. at [this location](https://github.com/OpenLMLab/MOSS/blob/5775a3ef16338550efc96fdf7da06a45de69af3e/moss_cli_demo.py#L2);
2. Increase the available CPU memory;
3. Set the `offload_folder` argument of `load_checkpoint_and_dispatch`, e.g. at [this location](https://github.com/OpenLMLab/MOSS/blob/5775a3ef16338550efc96fdf7da06a45de69af3e/moss_cli_demo.py#L31) (note that this may severely slow down inference); see the sketch below.
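A minimal sketch of options 1 and 3 together, using `accelerate`'s `load_checkpoint_and_dispatch`; the model name, checkpoint path, and offload directory below are illustrative assumptions:

```python
import os

# Option 1: choose which GPUs are visible (must be set before CUDA initializes).
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"  # illustrative

from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from transformers import AutoConfig, AutoModelForCausalLM

# Build the model skeleton without allocating real weights.
config = AutoConfig.from_pretrained("fnlp/moss-moon-003-sft", trust_remote_code=True)
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)

# Option 3: weights that fit in neither GPU nor CPU memory spill to disk.
model = load_checkpoint_and_dispatch(
    model,
    "path/to/checkpoint",        # illustrative; point at the downloaded weights
    device_map="auto",
    offload_folder="./offload",  # disk offload directory (slow, but loads)
)
```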

Half-precision inference cannot run on the CPU; you need to move both `model` and the inputs to your GPU, or set the `model`'s `dtype` to `torch.float32`.

You may need to run:
```python
model = model.cuda()
inputs["input_ids"] = inputs["input_ids"].cuda()
inputs["attention_mask"] = inputs["attention_mask"].cuda()
```
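Or, for the CPU route mentioned above, a one-line sketch that upcasts a half-precision model to `torch.float32`:

```python
model = model.float()  # upcast fp16 weights to fp32 so CPU inference works
```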

InternLM2-1.8B has been open-sourced.
- internlm/internlm2-1_8b: https://huggingface.co/internlm/internlm2-1_8b
- internlm/internlm2-chat-1_8b-sft: https://huggingface.co/internlm/internlm2-chat-1_8b-sft

@xiaopqr Hi, sorry for the inconvenience this has caused. The issue was fixed in 1871bcb26a4d879a25914e3daf909dc0ee636053; please use the dev branch, or wait for the next release to be merged into the main branch.

Hi @DesperateExplorer, Collie can use models from transformers in the case of ZeRO parallelism, but you need to call ``setup_distribution`` manually:
```python
from collie import setup_distribution, CollieConfig
from transformers...
```
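A fuller sketch of how that call might fit together; `setup_distribution` and `CollieConfig` come from the comment above, while the model name, parallel degree, and ZeRO stage are assumptions to check against the CoLLiE docs:

```python
from collie import setup_distribution, CollieConfig
from transformers import AutoModelForCausalLM

# Assumed example: run a Hugging Face model under CoLLiE's ZeRO data parallelism.
config = CollieConfig.from_pretrained("internlm/internlm2-1_8b", trust_remote_code=True)
config.dp_size = 8                                      # data-parallel degree (assumed)
config.ds_config = {"zero_optimization": {"stage": 3}}  # DeepSpeed ZeRO settings (assumed)

# Initialize the distributed environment before instantiating the model.
setup_distribution(config)

model = AutoModelForCausalLM.from_pretrained(
    "internlm/internlm2-1_8b", trust_remote_code=True
)
```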