
Results: 3 issues of someone

Around line 106 there is this snippet:

```python
if char_list[i] not in emit_dict[line_status[i]]:
    # if the current character has not yet appeared in the emission probability matrix, add it
    emit_dict[line_status[i]][char_list[i]] = 0.0
    # should be emit_dict[line_status[i]][char_list[i]] = 1.0 — state line_status[i]
    # has already produced the observation char_list[i] once at this point,
    # so it ought to be initialized to 1.0
else:
    emit_dict[line_status[i]][char_list[i]] += 1  # used to compute the emission probabilities
```

I think it should be initialized to 1.0.
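The report looks right: with `= 0.0` on the first occurrence and `+= 1` afterwards, every (state, character) pair is undercounted by exactly one. A minimal sketch of the corrected counting logic (the names `emit_dict`, `line_status`, and `char_list` follow the snippet above; the surrounding training loop is assumed):

```python
from collections import defaultdict

def count_emissions(sequences):
    """Accumulate HMM emission counts for word segmentation training.

    `sequences` is an iterable of (char_list, line_status) pairs, where
    char_list[i] is an observed character and line_status[i] is its
    hidden state tag (e.g. B/M/E/S).
    """
    # emit_dict[state][char] = number of times `state` emitted `char`
    emit_dict = defaultdict(lambda: defaultdict(float))
    for char_list, line_status in sequences:
        for char, state in zip(char_list, line_status):
            # a first occurrence contributes 1.0, matching the fix proposed
            # above; the defaultdict makes the "not in" branch unnecessary
            emit_dict[state][char] += 1.0
    return emit_dict

# The counts are later normalized per state to get emission probabilities:
# P(char | state) = emit_dict[state][char] / sum(emit_dict[state].values())
```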

First, the implementation logic of `get_masks` in ChatGLM-1:

```python
def get_masks(self, input_ids, device):
    batch_size, seq_length = input_ids.shape
    context_lengths = [seq.tolist().index(self.config.bos_token_id) for seq in input_ids]
    attention_mask = torch.ones((batch_size, seq_length, seq_length), device=device)
    attention_mask.tril_()
    for i, context_length in enumerate(context_lengths):
        attention_mask[i,...
```
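Since the quoted body is cut off, here is a self-contained sketch of the masking idea it implements: the prompt (everything before the BOS token) gets full bidirectional attention, while positions from BOS onward attend causally. The shapes and the per-sequence `context_lengths` computation follow the snippet above; the loop body and the final boolean conversion are assumptions about the truncated tail.

```python
import torch

def build_prefix_causal_mask(input_ids: torch.Tensor, bos_token_id: int) -> torch.Tensor:
    """Sketch of a ChatGLM-1-style attention mask.

    Positions before the BOS token (the prompt/context) attend to each
    other bidirectionally; positions from BOS onward attend causally.
    Returns a bool mask of shape (batch, 1, seq, seq) where True marks
    positions that should be masked out.
    """
    batch_size, seq_length = input_ids.shape
    # index of BOS in each sequence = length of the bidirectional context
    context_lengths = [seq.tolist().index(bos_token_id) for seq in input_ids]
    mask = torch.ones(batch_size, seq_length, seq_length)
    mask.tril_()  # causal lower-triangular part
    for i, context_length in enumerate(context_lengths):
        mask[i, :, :context_length] = 1  # context columns are always visible
    mask.unsqueeze_(1)  # broadcast over attention heads
    return (mask < 0.5).bool()  # True = masked (assumed convention)
```

The design choice this encodes is GLM's prefix-LM objective: unlike a plain causal mask, every generated token can attend to the entire prompt.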

### Is there an existing issue for this?

- [X] I have searched the existing issues

### Current Behavior

This is adapted from the `main.py` code, with LoRA fine-tuning logic added on top; multi-GPU fine-tuning is not implemented yet, only single-GPU 🤣

Training was run with `max_steps=5000` and `save_step=1000`. After training finished, eval/predict f1 = 0.68/0.71; after then loading checkpoint-5000, eval/predict f1 = 0.69/0.73 😱

There is also a strange phenomenon: with checkpoint-5000 loaded, different values of `per_device_eval_batch_size` give different eval/predict f1 😱...

per_device_eval_batch_size=6: eval/predict f1 = 0.68/0.73
per_device_eval_batch_size=16: eval/predict f1 = 0.69/0.74...
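Since the issue does not show its fine-tuning code, here is a minimal sketch of attaching LoRA to a ChatGLM-style model with the `peft` library; the model id, the target module name `query_key_value`, and all hyperparameter values are assumptions, not taken from the issue:

```python
from transformers import AutoModel
from peft import LoraConfig, get_peft_model, TaskType

# model id is illustrative; the issue builds on the repository's main.py instead
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half()

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,               # rank of the low-rank update (assumed value)
    lora_alpha=32,     # scaling factor (assumed value)
    lora_dropout=0.1,  # assumed value
    # ChatGLM fuses the Q/K/V projections into a single linear layer
    # named "query_key_value", the usual LoRA target for this model
    target_modules=["query_key_value"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters remain trainable
```

As for the metric shifting with `per_device_eval_batch_size`: one common, though unconfirmed here, cause in batched generation is padding, since grouping sequences into different batches changes how much padding each example sees and can perturb the decoded outputs slightly.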