
LM_Cocktail "Merge based on samples" ERROR: Subtraction, the `-` operator, with a bool tensor is not supported

Open charliedream1 opened this issue 1 year ago • 0 comments

Problem: merging chatglm3 with samples raises the following error:

```
Traceback (most recent call last):
  File "~/llm_cocktail/mix_mdl.py", line 67, in <module>
    model2 = mix_models_with_data(
  File "~/miniconda3/envs/train_py310/lib/python3.10/site-packages/LM_Cocktail/cocktail.py", line 102, in mix_models_with_data
    weights = compute_weights(model, tokenizer=tokenizer, param_list=param_list, model_type=model_type,
  File "~/miniconda3/envs/train_py310/lib/python3.10/site-packages/LM_Cocktail/utils.py", line 135, in compute_weights
    loss = loss_func(base_model=base_model, input_data=input_data)
  File "~/miniconda3/envs/train_py310/lib/python3.10/site-packages/LM_Cocktail/utils.py", line 230, in llm_loss
    output = base_model(**data)
  File "~/miniconda3/envs/train_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "~/miniconda3/envs/train_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "~/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 937, in forward
    transformer_outputs = self.transformer(
  File "~/miniconda3/envs/train_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "~/miniconda3/envs/train_py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "~/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 819, in forward
    full_attention_mask = self.get_masks(input_ids, past_key_values, padding_mask=attention_mask)
  File "~/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 690, in get_masks
    full_attention_mask -= padding_mask.unsqueeze(-1) - 1
RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `logical_not()` operator instead.
```
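For context, here is a minimal sketch (not LM_Cocktail's or ChatGLM's actual code, just a reproduction under assumed shapes) of why the failing line in `get_masks` raises: `padding_mask` arrives as a `bool` tensor, and PyTorch forbids arithmetic subtraction on bool tensors. Casting the mask to the float mask's dtype before the subtraction is one possible workaround:

```python
import torch

# Reproduce the failing pattern from modeling_chatglm.py's get_masks:
# subtracting from / with a bool tensor raises RuntimeError.
full_attention_mask = torch.ones(1, 4, 4)
padding_mask = torch.tensor([[True, True, False, False]])  # bool dtype

try:
    full_attention_mask -= padding_mask.unsqueeze(-1) - 1  # bool - int: not allowed
except RuntimeError as e:
    print("Reproduced:", e)

# Possible workaround: cast the bool mask to the float mask's dtype first,
# so the arithmetic is float - float and the result is unchanged in meaning.
full_attention_mask -= padding_mask.unsqueeze(-1).to(full_attention_mask.dtype) - 1
print(full_attention_mask.dtype)  # torch.float32
```

If patching the cached `modeling_chatglm.py` is undesirable, an alternative is to ensure the `attention_mask` passed into the model is an integer/float tensor rather than bool before the forward call.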

charliedream1 · Jan 24 '24 02:01