MOSS issues

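When saving during fine-tuning, all 8 GPUs write their own results, which takes a lot of storage. Can the checkpoint be saved from a single GPU?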

3

In `finetune_moss.py`: `if global_step % args.save_step == 0 and torch.cuda.current_device() == 0: model.save_checkpoint(args.output_dir, global_step)`

yangzhipeng1108
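The condition in the snippet above can be factored into a small helper. This is a minimal sketch, not code from `finetune_moss.py`: `should_save` is a hypothetical name, and it assumes the launcher exports the global rank in the `RANK` environment variable. Two caveats apply. `torch.cuda.current_device() == 0` is typically true for one process per machine, so on several machines it would save once per node rather than once overall. And DeepSpeed's documentation asks that every process call `save_checkpoint`, since with sharded states each rank holds a different part, so gating on a single rank is only safe for unsharded saves.

```python
import os
from typing import Optional


def should_save(global_step: int, save_step: int, rank: Optional[int] = None) -> bool:
    """Return True only on the global rank-0 process at a save step."""
    if rank is None:
        # Assumption: the launcher exports RANK; fall back to 0 for single-process runs.
        rank = int(os.environ.get("RANK", "0"))
    # The save_step > 0 check short-circuits before the modulo, avoiding division by zero.
    return save_step > 0 and global_step % save_step == 0 and rank == 0
```

In the training loop this would replace the inline condition, for example `if should_save(global_step, args.save_step): ...`.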

Merge APIs and gradio into one process

Add a demo that runs the API server and the gradio web UI in the same process

commissarster

Fine-tuning error: FileNotFoundError: No such file or directory: './sft_data/train.jsonl'

16

The fine-tuning instructions say to process the dataset into the conversation_without_plugins format and put it in the sft_data directory. I don't have a dataset of my own, so in run.sh I changed `--data_dir ./sft_data \` to `--data_dir ./SFT_data/conversations/conversation_without_plugins/harmless_conversations \`, but it still fails with FileNotFoundError: [Errno 2] No such file or directory: './SFT_data/conversations/conversation_without_plugins/harmless_conversations/train.jsonl'. Why is train.jsonl appended automatically, and how should I change this? ![QQ截图20230508170310](https://user-images.githubusercontent.com/55686901/236783522-3f985d3c-2f9b-4f36-b010-41fe7c7f2956.jpg)

lizhidomg
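The error message suggests that the script joins `--data_dir` with a fixed file name, `train.jsonl`, so the directory has to contain that file. The following is a minimal sketch for producing one from a directory of per-conversation files. It rests on assumptions not confirmed by the issue: that each `*.json` file under the source directory holds one conversation object, that the training script accepts one JSON object per line, and that a `val.jsonl` is also wanted. `build_jsonl_split` is a hypothetical helper name.

```python
import json
import random
from pathlib import Path


def build_jsonl_split(src_dir, out_dir, val_ratio=0.1, seed=0):
    """Merge per-conversation *.json files into train.jsonl and val.jsonl."""
    files = sorted(Path(src_dir).rglob("*.json"))
    random.Random(seed).shuffle(files)  # deterministic shuffle for a given seed
    n_val = int(len(files) * val_ratio)
    splits = {"val": files[:n_val], "train": files[n_val:]}

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    counts = {}
    for name, paths in splits.items():
        with open(out / f"{name}.jsonl", "w", encoding="utf-8") as f:
            for p in paths:
                record = json.loads(p.read_text(encoding="utf-8"))
                # One JSON object per line; keep non-ASCII text readable.
                f.write(json.dumps(record, ensure_ascii=False) + "\n")
        counts[name] = len(paths)
    return counts
```

After running it with `out_dir="./sft_data"`, the original `--data_dir ./sft_data` setting in run.sh would point at a directory that contains `train.jsonl`.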

OutOfMemoryError: CUDA out of memory.

6

Hardware: `RTX A5000(24GB) * 5` RAM: `210GB` Model: `moss-moon-003-base` Training fails with: ```bash OutOfMemoryError: CUDA out of memory. Tried to allocate 3.80 GiB (GPU 0; 23.69 GiB total capacity; 17.46 GiB already allocated; 850.56 MiB...

sk142857
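A back-of-the-envelope estimate helps judge whether this setup can hold the training state at all. The sketch below assumes roughly 16 billion parameters for `moss-moon-003-base` (not stated in the issue) and the common rule of thumb of about 16 bytes per parameter for mixed-precision Adam (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 master weights and the two optimizer moments). Activations and fragmentation are ignored, so the real requirement is higher.

```python
def training_state_gib(n_params: float, bytes_per_param: int = 16) -> float:
    """Memory for weights, gradients and optimizer state, in GiB."""
    return n_params * bytes_per_param / 2**30


def fits(n_params: float, n_gpus: int, gib_per_gpu: float, bytes_per_param: int = 16) -> bool:
    """True if the state fits in the combined GPU memory when perfectly sharded."""
    return training_state_gib(n_params, bytes_per_param) <= n_gpus * gib_per_gpu


if __name__ == "__main__":
    # Assumed model size: 16e9 parameters.
    print(f"{training_state_gib(16e9):.0f} GiB needed vs {5 * 24} GiB available")
    print("fits:", fits(16e9, n_gpus=5, gib_per_gpu=24))
```

Under these assumptions the training state alone exceeds the combined memory of five 24 GB cards, which is consistent with the reported error; offloading state to CPU memory or training fewer parameters are the usual directions to explore.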

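moss-moon-003-sft was already downloaded when reproducing moss_cli_demo.py, but running run.sh for fine-tuning downloads a model again, and I cannot find where the model is stored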

2

![微信图片_20230505223112](https://user-images.githubusercontent.com/55686901/236487339-23b04cd2-e91a-45bc-a937-46ac03803da6.png)

lizhidomg
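Models fetched by name through the Hugging Face libraries normally land in the hub cache rather than in the working directory. The sketch below resolves that location from the environment using only the standard library. The precedence between the variables is an assumption that may differ across library versions, and `hf_hub_cache_dir` and `repo_cache_dir` are hypothetical helper names. Note also that the command quoted in another issue in this list passes `fnlp/moss-moon-003-base`, a different repository from `moss-moon-003-sft`; if run.sh does the same, a fresh download would be expected.

```python
import os
from pathlib import Path


def hf_hub_cache_dir(env=None) -> Path:
    """Best-effort guess at the Hugging Face hub cache directory."""
    env = os.environ if env is None else env
    # Explicit cache overrides; TRANSFORMERS_CACHE is honored by older transformers releases.
    for var in ("TRANSFORMERS_CACHE", "HF_HUB_CACHE", "HUGGINGFACE_HUB_CACHE"):
        if env.get(var):
            return Path(env[var]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base).expanduser() / "huggingface" / "hub"


def repo_cache_dir(repo_id: str, env=None) -> Path:
    """Directory where a model repo such as 'fnlp/moss-moon-003-sft' is cached."""
    return hf_hub_cache_dir(env) / ("models--" + repo_id.replace("/", "--"))


if __name__ == "__main__":
    for repo in ("fnlp/moss-moon-003-sft", "fnlp/moss-moon-003-base"):
        path = repo_cache_dir(repo)
        print(path, "(exists)" if path.exists() else "(not downloaded)")
```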

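Improving the model's logical reasoning ability

I saw that strengthening logical reasoning is in the future plans. The approach disclosed so far is to scale up the base model and add targeted training data. Are there more details available, for example whether the targeted training data is used in pre-training or in fine-tuning, what form that data takes, and so on? Thanks.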

jarheadjoe

NameError: name 'transpose_matmul_248_kernel' is not defined

2

''' ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮ │ in :14 │ │ │ │ 11 │ │ metric(output.logits, labels, loss) │ │ 12 │ │ acc, train_loss = metric.get_metric()...

xia-huang-411303
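The traceback is truncated, so the cause cannot be confirmed from this excerpt. One general pattern that produces this kind of `NameError` is a function defined inside a `try: import ...` block: if the optional import fails, the name is never bound and the failure only surfaces at call time. Assuming (unverified) that the kernel is defined behind an optional `triton` import, a quick check of the environment could look like this; `missing_modules` is a hypothetical helper.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    # find_spec returns None for a top-level module that is not installed.
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # Assumption: the kernel depends on triton being importable.
    print("missing:", missing_modules(["triton", "torch"]))
```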

no attribute 'MossTokenizer', 'MossConfig', 'MossForCausalLM' reported during fine-tuning

3

Hello everyone. Recently, while fine-tuning, I frequently get the errors below: **Error 1:** ``` Traceback (most recent call last): File "finetune_moss.py", line 305, in train(args) File "finetune_moss.py", line 177, in train tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, trust_remote_code=True) File "/root/anaconda3/envs/moss/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 626, in from_pretrained...

Tian14267

Poor inference results with int4

1

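``` Welcome to the MOSS AI assistant! Type your message to start chatting. Type clear to clear the conversation history, or stop to end the conversation. : Xiao Ming's mother has three sons. The eldest is called Da Mao and the second is called Er Mao. What is the third son called? : I'm sorry, I cannot answer this question, because Xiao Ming's mother did not mention the third son's name. : Didn't I give you a hint? : I'm very sorry, I misunderstood you. If you need help, please tell me what kind of help you need. : Xiao Ming's mother, the names of the other two sons, and you can't infer the third? : Your guess is correct. If Xiao Ming's mother did not mention the third son's name, we can guess that he may have a very distinctive name, or that he may not have been born yet. : I'm asking you to infer the third son's name : I'm very sorry, I cannot infer the third son's name. I can only reason from known information, and I cannot determine whether Xiao Ming's mother has already told you the third son's name. If you would like to know more about Xiao Ming, I can help you look up relevant information. : People carry umbrellas on rainy days, so if I see someone holding an umbrella, can I infer that it is a rainy day? : Yes, you can infer that this person is holding an umbrella on a rainy day. When you see someone holding an umbrella, you can judge that it may rain. On rainy days, holding an umbrella is common behavior; it helps people avoid getting wet and also protects their belongings from rain damage. So if someone is holding an umbrella, it is very likely a rainy day. Of course, this is only a simple inference, and the actual situation depends on more factors, such as location, time, weather conditions, and so on. : ```...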

lucasjinreal

How to specify GPUs for fine-tuning

While testing fine-tuning, I found that it always uses the first GPU. How do I specify which GPUs to use? So far I have tried adding CUDA_VISIBLE_DEVICES=1 in run.sh, but it still uses GPU 0, and setting it through os in fine_tuning.py does not work either. Here is my command: num_machines=1 num_processes=1 machine_rank=0 CUDA_VISIBLE_DEVICES=3 accelerate launch \ --config_file ./configs/sft.yaml \ --num_processes $num_processes \ --num_machines $num_machines \ --machine_rank $machine_rank \ --deepspeed_multinode_launcher standard finetune_moss.py \ --model_name_or_path fnlp/moss-moon-003-base \ --data_dir...

lukaswangbk
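Two general properties of `CUDA_VISIBLE_DEVICES` may be relevant here. It is read when CUDA is initialized in a process, so setting it through `os.environ` after CUDA is already initialized has no effect. And inside the process the visible devices are renumbered from 0, so physical GPU 3 shows up as `cuda:0` in logs. The sketch below builds the environment for a child process and shows the renumbering; both helper names are hypothetical, and whether the launcher used here preserves the variable is not confirmed by the issue.

```python
import os


def env_with_gpus(gpu_ids, base_env=None):
    """Copy the environment and restrict CUDA to the given physical GPU ids."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def logical_to_physical(gpu_ids):
    """Map the device names seen inside the process to physical GPU ids."""
    return {f"cuda:{logical}": physical for logical, physical in enumerate(gpu_ids)}


if __name__ == "__main__":
    print(env_with_gpus([3])["CUDA_VISIBLE_DEVICES"])
    print(logical_to_physical([1, 3]))
    # A child process would be started with, for example:
    # subprocess.run(["python", "finetune_moss.py"], env=env_with_gpus([3]))
```

Checking `nvidia-smi` on the host, rather than the device index printed by the script, shows which physical card is actually in use.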
