Mihir Patel

Results 172 comments of Mihir Patel

@ez2rok apologies for the delayed turnaround. We're happy to review this -- the first step would be to update the tests and ensure they are all passing. Once that's done,...

Per offline discussion, we will close this PR.

Closing as this is stale, and I spent some time on it but was unable to reproduce :( If this is still an issue, please feel free to reopen! I am happy...

> @mvpatel2000: Distillation loads + performs inference on a second model in parallel with the model being trained. Do you have any suggestions on how to make this compatible with...

Closing for now

@RolandGao I believe the intention with `num_checkpoints_to_keep` was to be set per run. The reasoning here was that if a run is resumed, we don't want to delete previous checkpoints...

Hm... I see. In this case, I would recommend adding your own callback that deletes an older checkpoint whenever a newer version gets written for now. In the meantime,...
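A rough sketch of the cleanup step such a callback might run, assuming checkpoints are plain files in a single directory. The names `prune_old_checkpoints`, `keep`, and the `*.pt` pattern are hypothetical and not part of any library's API; wiring this into a callback event is left to the caller.

```python
from pathlib import Path


def prune_old_checkpoints(checkpoint_dir, keep=1, pattern="*.pt"):
    """Delete all but the `keep` most recently modified checkpoint files.

    Hypothetical helper: the directory layout and file pattern are assumptions.
    Returns the list of paths that were removed.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")

    # Sort newest first by modification time.
    checkpoints = sorted(
        Path(checkpoint_dir).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )

    removed = []
    for stale in checkpoints[keep:]:
        stale.unlink()  # remove the older checkpoint from disk
        removed.append(stale)
    return removed
```

Calling `prune_old_checkpoints("checkpoints", keep=1)` right after each new checkpoint is written would leave only the newest file, regardless of which run wrote the earlier ones.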

@viyjy can you please detail your workload so we can try to reproduce this?

This should be fixed in 0.16.1! Feel free to reopen if it's still an issue

The pydantic error comes from DeepSpeed. They failed to pin it, and it errors. We don't use pydantic.