Philip May

Results: 184 comments of Philip May

Ah, I see. Very nice to have a second maintainer. :-) Many thanks.

@kawine Reading your plot: is it possible that you are training on multiple different machines with one GPU each? It shows "copper-paper" and "noble-pyramid". I think those are names coming from...

It should be very easy to test this on Phi-2 or TinyLlama once the implementation works, right?
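For context, a quick way to sanity-check an implementation on such a small model is a short generation smoke test. The sketch below uses the standard transformers API; the Hub model IDs are the public TinyLlama and Phi-2 checkpoints, and the prompt is purely illustrative:

```python
# Minimal smoke test: load a small model and generate a few tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # or "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```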

This PR should maybe also add a few lines to the README about "how to use this".

Hi @lewtun, we had a discussion about KTO. Are you already working on this, or should we come up with a PR? We would try to use the code...
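For readers unfamiliar with it, KTO (Kahneman-Tversky Optimization) trains on unpaired binary feedback rather than preference pairs. A minimal sketch of what using such a trainer looks like, based on the KTOTrainer API that TRL eventually shipped (the model ID, dataset, and hyperparameters here are illustrative placeholders, not values from this thread):

```python
# Hedged sketch of KTO fine-tuning with TRL; all values are illustrative.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import KTOConfig, KTOTrainer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed small test model
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# KTO expects unpaired feedback: a prompt, a completion, and a boolean
# label marking the completion as desirable or undesirable.
train_dataset = Dataset.from_dict({
    "prompt": ["What is 2 + 2?", "What is 2 + 2?"],
    "completion": ["4", "5"],
    "label": [True, False],
})

args = KTOConfig(output_dir="kto-test", per_device_train_batch_size=2)
trainer = KTOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # `tokenizer=` in older TRL releases
)
trainer.train()
```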

That feature would be super useful @claralp . Thanks.

As far as I know, there was a ruling in the US that AI-generated content cannot be licensed. In this context, it is questionable from my point of view...

This is implemented via #177; closing this.

> Use synchronized batch normalization

Using sync batch norm does not help with single-GPU training and low batch sizes, though.
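For reference, sync batch norm only changes behavior when training runs in multiple processes: it aggregates batch statistics across all DDP workers, so on a single GPU it reduces to ordinary batch norm. A minimal sketch of the standard PyTorch conversion call:

```python
# Replace every BatchNorm* layer with SyncBatchNorm. With a single
# process/GPU this behaves exactly like regular batch norm, which is
# why it does not help there; under DDP it averages batch statistics
# across workers, effectively enlarging the normalization batch.
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
print(model)  # the BatchNorm2d layer is now a SyncBatchNorm
```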

I am not from Facebook or Meta, but AFAIK: when you fine-tune model A into model B and then release model B, the license of model B must match the...