Clara Pohland issues

Results: 5 issues of Clara Pohland

[KTO]: Fix nan losses and crashing job

fixes #1447 - use `nanmean()` instead of `mean()` for losses to avoid nan losses - remove obsolete `accelerator.gather` for metrics as the metrics are all collected to cpu anyway and...

KTO training produces NaN rewards

During training with the KTO Trainer, I occasionally get `nan` values as rewards. I am running the training as a job on Microsoft Azure with one GPU (NVIDIA A100 80GB...

KTO - support loading the adapter twice

For DPOTrainer there exists the option to load the Adapter from SFT training twice, as in [Reference model considerations with PEFT - load-the-adapter-twice](https://huggingface.co/docs/trl/main/en/dpo_trainer#using-option-3---load-the-adapter-twice): ``` python # Initialize the trainer, without...

KtoTrainer: BCO improvements

I recently experimented quite a bit with the BCO loss type in the KTO Trainer. This PR includes some changes that helped me to successfully and effectively run BCO on...

es_out: support Upstream Servers

This should enable support for the Upstream feature (https://docs.fluentbit.io/manual/configuration/upstream_servers) in the Elasticsearch Output Plugin. It was tested in a local setup with two Elasticsearch instances (es-1 & es-2). Here is some example...

waiting-for-user

no-update-dismiss