Joe Cummings
Hey @l3utterfly - thanks for creating the issue! This is definitely something that's on our minds, but is not currently available in the repo *yet*. Right now, if I'm worried...
> @joecummings I am trying with the `split: train [:25%]` method.
>
> My first 25% is done training, I see the PT files written to my output dir. However,...
We now support this.
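For context on the `split: train [:25%]` config mentioned above: that split string (Hugging Face datasets syntax) selects the first 25% of the training examples. A minimal sketch of the index arithmetic, with `resolve_percent_split` as a hypothetical helper and floor rounding as an assumption (the library's actual rounding mode may differ):

```python
# Illustrative sketch only -- not torchtune's or Hugging Face datasets'
# actual implementation of percentage splits.

def resolve_percent_split(num_examples: int, percent: int) -> slice:
    """Return the slice of example indices covered by train[:percent%].

    Rounding behavior (floor) is an assumption here; the datasets library
    supports several rounding modes for percent slicing.
    """
    stop = num_examples * percent // 100
    return slice(0, stop)

# E.g. with 1000 training examples, train[:25%] covers the first 250.
print(resolve_percent_split(1000, 25))
```

With this, resuming after the first 25% finishes is just a matter of training the remaining slices (e.g. `train[25%:50%]`) in subsequent runs.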
Hey @EugenHotaj - glad you're checking out torchtune. Up until now, we've managed to provide pretty extensive offerings, including long-context training, large models up to 405B, and RLHF, all on single...
Hey - super glad to hear it works @tginart! I figured the actual changes to the multi-node script are minimal, but from our side we want to test as many...
#2301
> @joecummings It seems to me that you haven't gone to holidays) Maybe you can give me some comments about this PR?

Haha yes I'm still (somewhat) here. I asked...
> Can you run `nvidia-smi` and confirm that there isn't any dead process consuming your memory before you run generate.py?
>
> However, there was a known issue where kvcache...
Would you like to open this up to community contributions?
Can you provide a few more details on how this could be implemented, then? Could you write up some acceptance criteria and code pointers?