Gopal Trital
@eric-mitchell I finished DPO, but when merging policy.pt I got the following error:    Am I doing something wrong here? **Another issue:** - I have around 80 k...
> Sounds like you worked the lora part out! For loading the new checkpoint, the issue is that you need to load `torch.load(model_archive_name)['state']`, since the archived parameters are in the...
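A minimal sketch of the fix suggested in the quoted reply: the archived checkpoint is a dict whose parameters live under the `'state'` key, so you load `torch.load(model_archive_name)['state']` rather than the top-level object. The helper name, file path, and the stand-in `nn.Linear` module here are placeholders, not the repo's actual code:

```python
import torch

def load_policy_checkpoint(model: torch.nn.Module, archive_path: str) -> torch.nn.Module:
    # The training script saves {'state': state_dict, ...}, so index into 'state'
    # instead of passing the whole archive to load_state_dict().
    archive = torch.load(archive_path, map_location="cpu")
    model.load_state_dict(archive["state"])
    return model

# Demo with a stand-in module in place of the real policy model.
policy = torch.nn.Linear(4, 4)
torch.save({"state": policy.state_dict()}, "policy.pt")
restored = load_policy_checkpoint(torch.nn.Linear(4, 4), "policy.pt")
assert torch.equal(policy.weight, restored.weight)
```

Passing the whole archive to `load_state_dict()` fails because the top-level keys (`'state'`, step counters, etc.) don't match any parameter names.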
I'm using the same code as get_hh(), since my data has the same structure. When it initially downloads the data from Hugging Face, it shows the same number of training and test...
@eric-mitchell I have figured out where the number of training examples gets reduced: in the following section of preference_datasets.py. You can see that I have 349 examples in the training dataset: but...
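One plausible explanation for the shrinking count, sketched below under assumptions: get_hh()-style loaders key the resulting dataset by prompt, so raw rows that share a prompt collapse into a single entry with multiple response pairs. The field names (`prompt`, `chosen`, `rejected`) are illustrative and may not match the repo's exact schema:

```python
from collections import defaultdict

# Raw rows as they might come from the Hugging Face download; two rows
# share the prompt "Q1".
rows = [
    {"prompt": "Q1", "chosen": "a", "rejected": "b"},
    {"prompt": "Q1", "chosen": "c", "rejected": "d"},
    {"prompt": "Q2", "chosen": "e", "rejected": "f"},
]

# Group by prompt: each entry accumulates all responses for that prompt
# plus (chosen_idx, rejected_idx) pairs into the response list.
data = defaultdict(lambda: {"responses": [], "pairs": []})
for row in rows:
    entry = data[row["prompt"]]
    n = len(entry["responses"])
    entry["responses"].extend([row["chosen"], row["rejected"]])
    entry["pairs"].append((n, n + 1))

print(len(rows), len(data))  # 3 raw rows, but only 2 prompt-keyed entries
```

If your 80 k rows contain repeated prompts, the prompt-keyed dataset will report fewer examples than the raw download without any data actually being dropped.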