
An open-source tool-augmented conversational language model from Fudan University

292 MOSS issues

```python
with open(os.path.join(self.data_dir, f'{self.data_type}.jsonl'), 'r') as f:
    for line in f:
        sample = json.loads(line)
        chat = sample['chat']
        num_turns = int(sample['num_turns'])
        meta_instruction = sample['meta_instruction']
        instruction_ids = self.tokenizer.encode(meta_instruction)
        assert isinstance(instruction_ids, list) and len(instruction_ids)...
```

In the search-engine plugin's conversation flow, can the citation MOSS appends to its final answer be understood as locating which numbered search result the answer came from? Why, for my own questions, does the model either not append this citation at the end, or append it with very low accuracy? I asked using the same template, only changing its content. ![image](https://github.com/OpenLMLab/MOSS/assets/115603416/afc1cdd0-879f-4e1e-ab34-a841e38e7d32)

In the finetune code, that part is also included in the loss computation. Could you explain what particular advantage this has over a conditioning language modeling loss?
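The distinction the question raises is whether prompt tokens contribute to the training loss. A minimal sketch of the usual masking trick, using PyTorch's `CrossEntropyLoss` convention that label `-100` is ignored (token ids and the helper name here are illustrative, not from the MOSS codebase):

```python
# Conditioning LM loss masks prompt positions with -100, the default
# ignore_index of torch.nn.CrossEntropyLoss, so only response tokens
# contribute to the gradient.
IGNORE_INDEX = -100

def build_labels(input_ids, prompt_len, compute_loss_on_prompt=False):
    """Return labels aligned with input_ids.

    With compute_loss_on_prompt=False (conditioning LM loss), the first
    prompt_len positions are masked out; with True, loss is computed on
    every token, as the question describes for the finetune code.
    """
    if compute_loss_on_prompt:
        return list(input_ids)
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])
```

Computing loss on the prompt as well effectively adds a language-modeling signal on the instruction text; masking it trains the model only to produce responses given instructions.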

Email: [email protected]. Also, does a 4090 GPU require quantization?
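A rough back-of-the-envelope answer: an RTX 4090 has 24 GB of VRAM, and a 16B-parameter model's weights alone exceed that at fp16. A sketch of the arithmetic (weights only; this ignores activations, the KV cache, and optimizer state, so real requirements are higher):

```python
def weight_memory_gb(n_params_billions, bits_per_param):
    """Approximate memory needed just to hold the weights, in GiB."""
    return n_params_billions * 1e9 * bits_per_param / 8 / 1024**3

# For a 16B-parameter model such as MOSS:
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(16, bits):.1f} GiB")
```

At 16-bit the weights are roughly 30 GiB, which does not fit in 24 GB, while 8-bit (~15 GiB) and 4-bit (~7.5 GiB) do, so quantization is the practical route on a single 4090.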

@linonetwo @jsl9208 @xpqiu @meta-tabchen Hello,
```python
with open(os.path.join(self.data_dir, f'{self.data_type}.jsonl'), 'r') as f:
    for line in f:
        sample = json.loads(line)
        chat = sample['chat']
        num_turns = int(sample['num_turns'])
        meta_instruction = sample['meta_instruction']
        instruction_ids =...
```

When fine-tuning the model's plugin abilities, is this data needed: https://github.com/OpenLMLab/MOSS/tree/main/SFT_data/conversations/conversation_with_plugins/web_search ? How are the Inner Thoughts and Tool Responses fields in that data used for training?
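One common way such fields enter training is by flattening each conversation turn into a single tagged string, so the model learns to emit its reasoning and tool calls in sequence. A minimal sketch, assuming role names matching the SFT_data layout mentioned above; the exact special-token format is an assumption, not the repo's verbatim template:

```python
# Hypothetical renderer: flattens one plugin-conversation turn into a
# training string. The role order (Human -> Inner Thoughts -> Commands ->
# Tool Responses -> MOSS) mirrors the fields named in the question; the
# <|Role|> delimiters are illustrative placeholders.
def render_turn(turn):
    parts = []
    for role in ("Human", "Inner Thoughts", "Commands", "Tool Responses", "MOSS"):
        if role in turn:
            parts.append(f"<|{role}|>: {turn[role]}")
    return "\n".join(parts)
```

Under this scheme the Inner Thoughts and Tool Responses segments are ordinary tokens in the target sequence, so the model is trained to generate the thought and the tool command, with the tool's response spliced in as context for the final answer.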

Modified finetune_moss.py as follows: `accelerator = Accelerator(mixed_precision='fp8')`. The environment is NVIDIA's container nvcr.io/nvidia/pytorch:23.06-py3 (https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch). Because the GPU does not have enough memory, DeepSpeed offloads to CPU; sft.yaml was modified as follows:
```yaml
command_file: null
commands: null
compute_environment: LOCAL_MACHINE
deepspeed_config:
  gradient_accumulation_steps: 1
  gradient_clipping: 1.0
  offload_optimizer_device: cpu
  offload_param_device: cpu
  zero3_init_flag:...
```

```python
def load_from_torch_shard_ckpt(model, ckpt_dir):
    """
    Load sharded checkpoints directly from huggingface dir.
    """
    with open(os.path.join(ckpt_dir, 'pytorch_model.bin.index.json')) as fp:
        ckpt_index = json.load(fp)
    total_size = ckpt_index['metadata']['total_size']
    weight_map = ckpt_index['weight_map']
    file_weight_map = {}
    for...
```
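The truncated loop above is building a reverse mapping from shard file to the weight names it holds, driven by the `weight_map` in `pytorch_model.bin.index.json`. A self-contained sketch of that bookkeeping step, using a synthetic index (the shard filenames here are made up for illustration):

```python
def group_by_shard(index):
    """Invert a HuggingFace-style index: shard filename -> weight names."""
    file_weight_map = {}
    for weight_name, filename in index["weight_map"].items():
        file_weight_map.setdefault(filename, []).append(weight_name)
    return file_weight_map

# Synthetic index in the shape of pytorch_model.bin.index.json.
index = {
    "metadata": {"total_size": 123},
    "weight_map": {
        "wte.weight": "pytorch_model-00001-of-00002.bin",
        "lm_head.weight": "pytorch_model-00002-of-00002.bin",
    },
}
```

Grouping by file lets the loader open each shard once and load only the tensors it owns, instead of re-reading shards per weight.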