Chen Qian

69 comments by Chen Qian

@harupy @BenWilson2 Do you know if this has been handled?

This issue has been given low priority, but I think we can do a quick fix for it. Let's keep it open; OSS contributions are welcome.

I am using TRL's `SFTTrainer` to train OPT (loaded in int8 + LoRA), and I am seeing the same issue. While

```
with torch.autocast("cuda"):
    trainer.train()
```

solves the precision mismatch issue, it...

We can use it for DeBERTa if my understanding is correct; Abheesht should have more context. It's just a general approach ([paper](https://arxiv.org/pdf/1803.02155.pdf)), and I am planning to use it for...

@dbczumar Corey mind reviewing this PR? Also one question - now I am only adding control for fluent APIs, do you think we should have the same control over `MlflowClient`'s...

@harupy It's the cross version testing failure again, sigh... could you help merge? thanks! btw, I am naming it `autologging.py` instead of `autolog.py` because it has a name conflict with...

Thanks Weichen! I played with the notebook, the experience is pretty smooth. Two things about the file we save with checkpoints:

- Can we put checkpoints along with the metrics...

close and reopen to disable the CI cache.

@jbischof Jon I believe you still have the training script?