dlrover
Torch Trainer Hook
The goal of this issue is to add a hook or callback system to our PyTorch trainer so that it can invoke resource monitoring and time reporting at the start of training. The hook should integrate cleanly with the training process and must not interfere with the main training tasks.
The hook should be designed so that it can trigger our resource reporter and, potentially, other types of monitors we may add in the future.
We can implement this manually or use callback mechanisms similar to what is available in PyTorch Lightning.
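For the manual route, a minimal sketch of what such a hook system could look like is below. All names here (`TrainerHook`, `ResourceReportHook`, `Trainer`, `report_fn`) are hypothetical and are not existing dlrover APIs; the event names `on_train_start` and `on_train_end` are chosen to resemble the ones PyTorch Lightning callbacks use. Hook failures are logged and swallowed so that a broken monitor cannot interrupt training.

```python
import logging
import time
from typing import Any, Callable, Dict, List, Optional

logger = logging.getLogger(__name__)


class TrainerHook:
    """Hypothetical base class: override only the events you need."""

    def on_train_start(self, trainer: "Trainer") -> None:
        pass

    def on_train_end(self, trainer: "Trainer") -> None:
        pass


class ResourceReportHook(TrainerHook):
    """Reports a start event and the elapsed training time via report_fn.

    report_fn is an assumed stand-in for the real resource reporter.
    """

    def __init__(self, report_fn: Callable[[Dict[str, Any]], None]) -> None:
        self._report_fn = report_fn
        self._start: Optional[float] = None

    def on_train_start(self, trainer: "Trainer") -> None:
        self._start = time.monotonic()
        self._report_fn({"event": "train_start", "time": time.time()})

    def on_train_end(self, trainer: "Trainer") -> None:
        elapsed = time.monotonic() - self._start
        self._report_fn({"event": "train_end", "elapsed_s": elapsed})


class Trainer:
    """Toy trainer that fires hooks around a user-supplied training function."""

    def __init__(
        self,
        train_fn: Callable[[], Any],
        hooks: Optional[List[TrainerHook]] = None,
    ) -> None:
        self.train_fn = train_fn
        self.hooks = list(hooks or [])

    def _fire(self, event: str) -> None:
        for hook in self.hooks:
            try:
                getattr(hook, event)(self)
            except Exception:
                # A failing monitor is logged but never stops training.
                logger.exception("hook %r failed on %s", hook, event)

    def fit(self) -> Any:
        self._fire("on_train_start")
        try:
            return self.train_fn()
        finally:
            # Runs even if train_fn raises, so the end event is always reported.
            self._fire("on_train_end")
```

With this shape, adding another monitor later only means writing a new `TrainerHook` subclass and passing it in the `hooks` list, for example `Trainer(train_fn, hooks=[ResourceReportHook(print)]).fit()`.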
We can support the Trainer in Lightning and implement a Lightning callback.
Thank you for your prompt and helpful response! I'll definitely look into implementing the Lightning callback as you suggested. The link you provided will be a valuable resource for my implementation. Thanks again!
This issue has been automatically marked as stale because it has not had recent activity.