
Issue with the sft_packing implementation

Open dyh1996 opened this issue 2 years ago • 4 comments

Reminder

  • [X] I have read the README and searched the existing issues.

Reproduction

From what I can see, the current sft_packing implementation simply concatenates different single-turn SFT examples into one sequence and then computes the loss on each target span separately.

    def preprocess_packed_supervised_dataset(
        examples: Dict[str, List[Any]],
        tokenizer: "PreTrainedTokenizer",
        template: "Template",
        data_args: "DataArguments",
    ) -> Dict[str, List[List[int]]]:
        # build inputs with format <bos> X1 Y1 <eos> <bos> X2 Y2 <eos>
        # and labels with format <ignore> ... <ignore> Y1 <eos> <ignore> ... <ignore> Y2 <eos>
        model_inputs = {"input_ids": [], "attention_mask": [], "labels": []}

Shouldn't position_ids also be adjusted here, so that each single-turn SFT example is not affected by the other examples concatenated before it when its loss is computed?
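For illustration, a minimal sketch of what restarting the positions for every packed example could look like (the helper name and the plain-Python representation are just for this example, not LLaMA-Factory's code):

```python
from typing import List

def make_packed_position_ids(seq_lens: List[int]) -> List[int]:
    # Restart the position counter at 0 for every packed example instead of
    # numbering the whole packed row 0..L-1.
    position_ids: List[int] = []
    for length in seq_lens:
        position_ids.extend(range(length))
    return position_ids

# Two examples of lengths 4 and 3 packed into one row:
# continuous positions would be [0, 1, 2, 3, 4, 5, 6],
# per-example positions are     [0, 1, 2, 3, 0, 1, 2].
print(make_packed_position_ids([4, 3]))
```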

Expected behavior

No response

System Info

No response

Others

No response

dyh1996 · Jan 22 '24 09:01

A question: with packing (especially for SFT), besides the position ids mentioned above, shouldn't a suitable attention mask also be set up to isolate the different instances from each other?
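Something along these lines, as a rough sketch (dense boolean mask and function name are my own illustration; a real implementation would likely avoid materializing the full L×L mask):

```python
import torch
from typing import List

def block_diagonal_causal_mask(seq_lens: List[int]) -> torch.Tensor:
    # Boolean mask of shape (L, L) with L = sum(seq_lens); True means "may attend".
    # Each packed example attends causally to its own tokens only.
    total = sum(seq_lens)
    mask = torch.zeros(total, total, dtype=torch.bool)
    start = 0
    for length in seq_lens:
        end = start + length
        mask[start:end, start:end] = torch.tril(
            torch.ones(length, length, dtype=torch.bool)
        )
        start = end
    return mask

# Two packed examples of lengths 3 and 2: tokens 3-4 cannot see tokens 0-2.
print(block_diagonal_causal_mask([3, 2]).int())
```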

muzhi1991 · Apr 20 '24 14:04

@hiyouga Has LLaMA-Factory implemented this 'https://github.com/MeetKai/functionary/tree/main/functionary/train/packing#assert-implementation' for packing yet? I did notice the 'preprocess_packed_supervised_dataset' part of the code in the repo.

DinhLuan14 · Apr 22 '24 07:04

Any update on this issue? @hiyouga

Ricardokevins · Apr 30 '24 09:04

Llama 3 also modified the attention mask, but it did not mention position ids. Is modifying position ids really necessary? RoPE is a relative positional encoding in the first place.
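A quick self-contained check of that relative-position property with a toy RoPE implementation (my own sketch, not code from any library): the pre-softmax logit depends only on the offset between the query and key positions, so shifting a whole packed segment's positions by a constant does not change its within-segment scores; cross-segment leakage is governed by the attention mask, not by the position ids.

```python
import torch

def rope(x: torch.Tensor, pos: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    # Minimal rotary position embedding for a single head; x: (seq, dim), pos: (seq,).
    dim = x.shape[-1]
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=x.dtype) / dim))
    angles = pos.to(x.dtype)[:, None] * inv_freq[None, :]   # (seq, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

torch.manual_seed(0)
q = torch.randn(1, 64, dtype=torch.float64)
k = torch.randn(1, 64, dtype=torch.float64)

# The logit between a query at position m and a key at position n depends only
# on the offset m - n, so shifting both positions by a constant (e.g. the length
# of a preceding packed example) leaves the score unchanged.
logit_a = rope(q, torch.tensor([7])) @ rope(k, torch.tensor([3])).T
logit_b = rope(q, torch.tensor([107])) @ rope(k, torch.tensor([103])).T
print(torch.allclose(logit_a, logit_b))  # True
```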

chiosChen · May 19 '24 03:05

> A question: with packing (especially for SFT), besides the position ids mentioned above, shouldn't a suitable attention mask also be set up to isolate the different instances from each other?

Same question here: why is the attention mask not handled? With plain concatenation, what is the point of letting the later examples attend to the earlier ones?
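A toy comparison with random tensors and PyTorch's scaled_dot_product_attention (just an illustration, not LLaMA-Factory code) makes the contamination concrete: under a plain causal mask the second packed example's outputs depend on the first example, while a block-diagonal mask keeps them independent.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
len_a, len_b = 4, 3
q = torch.randn(1, 1, len_a + len_b, 8)   # (batch, heads, seq, head_dim)
k, v = torch.randn_like(q), torch.randn_like(q)

# Plain causal mask over the whole packed row: example B's tokens attend to example A.
causal = torch.tril(torch.ones(len_a + len_b, len_a + len_b, dtype=torch.bool))

# Block-diagonal causal mask: example B is isolated from example A.
isolated = causal.clone()
isolated[len_a:, :len_a] = False

out_causal = F.scaled_dot_product_attention(q, k, v, attn_mask=causal)
out_isolated = F.scaled_dot_product_attention(q, k, v, attn_mask=isolated)

# Example A's outputs match, example B's outputs differ -> plain packing leaks context.
print(torch.allclose(out_causal[..., :len_a, :], out_isolated[..., :len_a, :]))  # True
print(torch.allclose(out_causal[..., len_a:, :], out_isolated[..., len_a:, :]))  # False
```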

lugimzzz · Jun 13 '24 09:06

> @hiyouga Has LLaMA-Factory implemented this 'MeetKai/functionary@main/functionary/train/packing#assert-implementation' for packing yet? I did notice the 'preprocess_packed_supervised_dataset' part of the code in the repo.

The function 'preprocess_packed_supervised_dataset' currently does not build an attention mask that isolates the packed instances from one another.

@hiyouga, do you have any plans to add this feature in the future?

letterk · Jun 15 '24 05:06

@letterk This will be fixed after #4224 is merged.

hiyouga · Jun 15 '24 06:06