
An open-source tool-augmented conversational language model from Fudan University

292 MOSS issues

```python
with open(os.path.join(self.data_dir, f'{self.data_type}.jsonl'), 'r') as f:
    for line in f:
        sample = json.loads(line)
        chat = sample['chat']
        num_turns = int(sample['num_turns'])
        meta_instruction = sample['meta_instruction']
        instruction_ids = self.tokenizer.encode(meta_instruction)
        assert isinstance(instruction_ids, list) and len(instruction_ids)...
```

In the search-engine plugin's conversation flow, can the citation MOSS appends to its final answer be understood as locating which numbered search result the answer came from? Why, for my own questions, does the model either not append this citation at the end, or append it with very low accuracy? I asked using the same template, only changing its content. ![image](https://github.com/OpenLMLab/MOSS/assets/115603416/afc1cdd0-879f-4e1e-ab34-a841e38e7d32)

In the finetune code, that part is also included in the loss computation. Could you explain what particular advantage this has over a conditioning language modeling loss?
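The distinction the question raises is whether prompt tokens contribute to the training loss. A minimal sketch of the usual masking trick, using PyTorch's `CrossEntropyLoss` convention that label `-100` is ignored (token ids and the helper name here are illustrative, not from the MOSS codebase):

```python
# Conditioning LM loss masks prompt positions with -100, the default
# ignore_index of torch.nn.CrossEntropyLoss, so only response tokens
# contribute to the gradient.
IGNORE_INDEX = -100

def build_labels(input_ids, prompt_len, compute_loss_on_prompt=False):
    """Return labels aligned with input_ids.

    With compute_loss_on_prompt=False (conditioning LM loss), the first
    prompt_len positions are masked out; with True, loss is computed on
    every token, as the question describes for the finetune code.
    """
    if compute_loss_on_prompt:
        return list(input_ids)
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])
```

Computing loss on the prompt as well effectively adds a language-modeling signal on the instruction text; masking it trains the model only to produce responses given instructions.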

Email: [email protected]. Also, does a 4090 GPU require quantization?
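A rough back-of-the-envelope answer: an RTX 4090 has 24 GB of VRAM, and a 16B-parameter model's weights alone exceed that at fp16. A sketch of the arithmetic (weights only; this ignores activations, the KV cache, and optimizer state, so real requirements are higher):

```python
def weight_memory_gb(n_params_billions, bits_per_param):
    """Approximate memory needed just to hold the weights, in GiB."""
    return n_params_billions * 1e9 * bits_per_param / 8 / 1024**3

# For a 16B-parameter model such as MOSS:
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(16, bits):.1f} GiB")
```

At 16-bit the weights are roughly 30 GiB, which does not fit in 24 GB, while 8-bit (~15 GiB) and 4-bit (~7.5 GiB) do, so quantization is the practical route on a single 4090.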

@linonetwo @jsl9208 @xpqiu @meta-tabchen Hello,
```python
with open(os.path.join(self.data_dir, f'{self.data_type}.jsonl'), 'r') as f:
    for line in f:
        sample = json.loads(line)
        chat = sample['chat']
        num_turns = int(sample['num_turns'])
        meta_instruction = sample['meta_instruction']
        instruction_ids =...
```

When fine-tuning the model's plugin abilities, is this data needed: https://github.com/OpenLMLab/MOSS/tree/main/SFT_data/conversations/conversation_with_plugins/web_search ? How are the Inner Thoughts and Tool Responses fields in that data used for training?
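One common way such fields enter training is by flattening each conversation turn into a single tagged string, so the model learns to emit its reasoning and tool calls in sequence. A minimal sketch, assuming role names matching the SFT_data layout mentioned above; the exact special-token format is an assumption, not the repo's verbatim template:

```python
# Hypothetical renderer: flattens one plugin-conversation turn into a
# training string. The role order (Human -> Inner Thoughts -> Commands ->
# Tool Responses -> MOSS) mirrors the fields named in the question; the
# <|Role|> delimiters are illustrative placeholders.
def render_turn(turn):
    parts = []
    for role in ("Human", "Inner Thoughts", "Commands", "Tool Responses", "MOSS"):
        if role in turn:
            parts.append(f"<|{role}|>: {turn[role]}")
    return "\n".join(parts)
```

Under this scheme the Inner Thoughts and Tool Responses segments are ordinary tokens in the target sequence, so the model is trained to generate the thought and the tool command, with the tool's response spliced in as context for the final answer.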

Modified finetune_moss.py as follows: `accelerator = Accelerator(mixed_precision='fp8')`. The environment is NVIDIA's container nvcr.io/nvidia/pytorch:23.06-py3 (https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch). Because the GPU does not have enough memory, DeepSpeed offloads to CPU; sft.yaml was modified as follows:
```yaml
command_file: null
commands: null
compute_environment: LOCAL_MACHINE
deepspeed_config:
  gradient_accumulation_steps: 1
  gradient_clipping: 1.0
  offload_optimizer_device: cpu
  offload_param_device: cpu
  zero3_init_flag:...
```

```python
def load_from_torch_shard_ckpt(model, ckpt_dir):
    """
    Load sharded checkpoints directly from huggingface dir.
    """
    with open(os.path.join(ckpt_dir, 'pytorch_model.bin.index.json')) as fp:
        ckpt_index = json.load(fp)
    total_size = ckpt_index['metadata']['total_size']
    weight_map = ckpt_index['weight_map']
    file_weight_map = {}
    for...
```
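The truncated loop above is building a reverse mapping from shard file to the weight names it holds, driven by the `weight_map` in `pytorch_model.bin.index.json`. A self-contained sketch of that bookkeeping step, using a synthetic index (the shard filenames here are made up for illustration):

```python
def group_by_shard(index):
    """Invert a HuggingFace-style index: shard filename -> weight names."""
    file_weight_map = {}
    for weight_name, filename in index["weight_map"].items():
        file_weight_map.setdefault(filename, []).append(weight_name)
    return file_weight_map

# Synthetic index in the shape of pytorch_model.bin.index.json.
index = {
    "metadata": {"total_size": 123},
    "weight_map": {
        "wte.weight": "pytorch_model-00001-of-00002.bin",
        "lm_head.weight": "pytorch_model-00002-of-00002.bin",
    },
}
```

Grouping by file lets the loader open each shard once and load only the tensors it owns, instead of re-reading shards per weight.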