Carlos Gomes

5 issues by Carlos Gomes

Hey Folks! I've had a really good time playing with torchtitan so far :) Looking into the code, as it stands, the way the data loader is wrapped by next_batch...

For some workloads, it is really important to perform validation on a different dataset every n iterations. This seems reasonably straightforward to add to the training loop and training...

Hey Folks! I've had a really good time playing with torchtitan so far :) Looking into the code, as it stands, the way the data loader is wrapped by `next_batch`...

- Add validation loss
- Add batched inference

Ideally we would also add COCO2014 as a dataset. However, I haven't been able to find a Hugging Face dataset containing both the images...

CLA Signed

### Bug description

After loading from a checkpoint, the loss spikes and then returns to expected values after a few steps. To reproduce, run a first training, storing a checkpoint, and...