DamonYangyang
Do you have the same problem? loss_IOU and loss_Box converge very slowly, and their values hover around 0.4.
UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor....
In the reward model implementation, I noticed these two lines of code: `c_truncated_reward = chosen_reward[divergence_ind:end_ind]` and `r_truncated_reward = rejected_reward[divergence_ind:end_ind]`. It should take the answer part, but chosen and rejected take...
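To make the question about those two slices concrete, here is a minimal pure-Python sketch of how such a truncation could work. It is not the repository's code: the helper names (`first_pad_index`, `log_sigmoid`, `truncated_pairwise_loss`), the right-padding convention, and the way `divergence_ind` and `end_ind` are derived are all assumptions made for illustration.

```python
import math


def first_pad_index(token_ids, pad_id):
    """Index of the first pad token, or the sequence length if there is none.
    Assumes padding is only on the right (an assumption of this sketch)."""
    for i, tok in enumerate(token_ids):
        if tok == pad_id:
            return i
    return len(token_ids)


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def truncated_pairwise_loss(chosen_ids, rejected_ids,
                            chosen_reward, rejected_reward, pad_id):
    """Hypothetical helper: compare per-token rewards only on the span
    where the chosen and rejected sequences differ."""
    seq_len = len(chosen_ids)
    # Assumed definition: first position where the two sequences differ.
    divergence_ind = next(
        (i for i in range(seq_len) if chosen_ids[i] != rejected_ids[i]), None
    )
    if divergence_ind is None:
        # Identical sequences: fall back to the last position only.
        divergence_ind, end_ind = seq_len - 1, seq_len
    else:
        # Assumed definition: end of the longer (unpadded) sequence.
        end_ind = max(first_pad_index(chosen_ids, pad_id),
                      first_pad_index(rejected_ids, pad_id))

    c_truncated_reward = chosen_reward[divergence_ind:end_ind]
    r_truncated_reward = rejected_reward[divergence_ind:end_ind]

    # Pairwise ranking loss averaged over the truncated span.
    losses = [-log_sigmoid(c - r)
              for c, r in zip(c_truncated_reward, r_truncated_reward)]
    return divergence_ind, end_ind, sum(losses) / len(losses)


# Toy example: shared prefix [1, 2], then the sequences diverge; 0 is the pad id.
chosen_ids = [1, 2, 3, 0, 0]
rejected_ids = [1, 2, 4, 5, 0]
chosen_reward = [0.1, 0.2, 0.9, 0.8, 0.0]
rejected_reward = [0.1, 0.2, 0.3, 0.4, 0.0]

div, end, loss = truncated_pairwise_loss(
    chosen_ids, rejected_ids, chosen_reward, rejected_reward, pad_id=0
)
print(div, end, loss)
```

Under these assumptions, both slices use the same index range. It starts at the first token where the two sequences differ and ends where the longer sequence stops, so tokens shared by both answers before that point are excluded from the comparison.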