[REQUEST] New Integration - NeptuneMonitor
Is your feature request related to a problem? Please describe. Add a Neptune integration for DeepSpeed, so that users who train models with DeepSpeed can also track their training metrics in Neptune.
https://neptune.ai/
Describe the solution you'd like: Add a NeptuneMonitor class as an additional monitoring option for users.
Describe alternatives you've considered: N/A
Additional context: N/A
Hi @LeoRoccoBreedt - we likely do not have the bandwidth to add this as a request, but would welcome PRs if you wanted to submit one to add NeptuneMonitor.
Hi @loadams - happy to submit a PR for this. Just wanted to follow the guidelines before jumping into it.
From the current implementations of the monitoring tools, it looks like the following are required in the main monitoring package script:
- NeptuneMonitor
- log()
- write_events()
I believe these are enough for an MVP of the Neptune monitoring addition. Are there any docs on how the monitoring system works with DeepSpeed, for example if users want to extend their monitoring beyond the write_events() method that is called during their training loops?
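For reference, here is a rough sketch of the shape such an MVP might take. It is an illustration under assumptions, not the DeepSpeed or Neptune implementation: the `Monitor` base class is a local stand-in rather than an import from DeepSpeed, the `run` object is injected and only assumed to support `run[name].append(value, step=...)`, the `enabled` config key is hypothetical, and events are assumed to be `(name, value, step)` tuples. `FakeRun` exists only so the example runs without a Neptune account.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class Monitor(ABC):
    """Local stand-in for a monitor base class (assumed shape, not imported)."""

    def __init__(self, monitor_config):
        self.monitor_config = monitor_config

    @abstractmethod
    def write_events(self, event_list):
        ...


class NeptuneMonitor(Monitor):
    """Hypothetical monitor that forwards events to an injected run object."""

    def __init__(self, monitor_config, run):
        super().__init__(monitor_config)
        # "enabled" is an illustrative config key, not a confirmed option.
        self.enabled = bool(monitor_config.get("enabled", False))
        self.run = run if self.enabled else None

    def log(self, name, value, step=None):
        # Append one data point to the series stored under `name`.
        if self.enabled:
            self.run[name].append(value, step=step)

    def write_events(self, event_list):
        # Each event is assumed to be a (name, value, step) tuple.
        for name, value, step in event_list:
            self.log(name, value, step)


class _FakeSeries:
    """In-memory series that records (step, value) pairs."""

    def __init__(self):
        self.points = []

    def append(self, value, step=None):
        self.points.append((step, value))


class FakeRun:
    """In-memory substitute for a tracking run, used only for this demo."""

    def __init__(self):
        self.fields = defaultdict(_FakeSeries)

    def __getitem__(self, key):
        return self.fields[key]


fake_run = FakeRun()
monitor = NeptuneMonitor({"enabled": True}, run=fake_run)
monitor.write_events([("Train/loss", 0.5, 1), ("Train/loss", 0.4, 2)])
print(fake_run.fields["Train/loss"].points)  # [(1, 0.5), (2, 0.4)]
```

Injecting the run object keeps the class testable without network access. A real integration would presumably create the run from the monitor config and only log from a single process, but those details depend on the existing monitor code.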
Hi @LeoRoccoBreedt - the PR here: https://github.com/deepspeedai/DeepSpeed/pull/5466 is a good starting point for the integration needed for a monitor.
Thanks @loadams! I'll start working on this soon.
@LeoRoccoBreedt - thanks, feel free to tag me or this issue in the PR when it is ready