Joe Cummings
Hey @l3utterfly - thanks for creating the issue! This is definitely something that's on our minds, but is not currently available in the repo *yet*. Right now, if I'm worried...
> @joecummings I am trying with the `split: train [:25%]` method.
>
> My first 25% is done training, I see the PT files written to my output dir. However,...
We now support this.
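For context on the `split: train [:25%]` config mentioned above: that split string (Hugging Face datasets syntax) selects the first 25% of the training examples. A minimal sketch of the index arithmetic, with `resolve_percent_split` as a hypothetical helper and floor rounding as an assumption (the library's actual rounding mode may differ):

```python
# Illustrative sketch only -- not torchtune's or Hugging Face datasets'
# actual implementation of percentage splits.

def resolve_percent_split(num_examples: int, percent: int) -> slice:
    """Return the slice of example indices covered by train[:percent%].

    Rounding behavior (floor) is an assumption here; the datasets library
    supports several rounding modes for percent slicing.
    """
    stop = num_examples * percent // 100
    return slice(0, stop)

# E.g. with 1000 training examples, train[:25%] covers the first 250.
print(resolve_percent_split(1000, 25))
```

With this, resuming after the first 25% finishes is just a matter of training the remaining slices (e.g. `train[25%:50%]`) in subsequent runs.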
Hey @EugenHotaj - glad you're checking out torchtune. Up until now, we've managed to provide pretty extensive offerings, including long-context training, large models up to 405B, and RLHF, all on single...
Hey - super glad to hear it works @tginart! I figured the actual changes to the multi-node script are minimal, but from our side we want to test as many...
#2301
> @joecummings It seems to me that you haven't gone to holidays) Maybe you can give me some comments about this PR?

Haha yes I'm still (somewhat) here. I asked...
> Can you run `nvidia-smi` and confirm that there isn't any dead process consuming your memory before you run generate.py?
>
> However, there was a known issue where kvcache...
Would you like to open this up to community contributions?
Can you provide a few more details on how this could be implemented, then? Could you write up some acceptance criteria and code pointers?