dlrover
On Ascend NPU, stop workers together with their child processes
What changes were proposed in this pull request?
Adapt worker shutdown to Ascend NPU: stop each worker and make sure no leftover processes are still using the NPU.
Why are the changes needed?
On Ascend NPU, each worker forks many child processes, and all of them must be cleaned up when the worker is stopped.
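As a rough illustration of the cleanup idea (not the code in this PR, whose implementation in `dlrover/python/elastic_agent/torch/training.py` is not shown here), the sketch below stops a worker and everything it forked by signalling the worker's process group. It assumes a Linux/POSIX host and that the worker was launched with `start_new_session=True`, so that its PID is also its process group ID. The names `stop_worker_tree`, `_signal_group`, and `_group_alive` are hypothetical.

```python
import os
import signal
import subprocess
import time


def _signal_group(pgid: int, sig: int) -> None:
    """Send `sig` to every process in the group; ignore an already-empty group."""
    try:
        os.killpg(pgid, sig)
    except ProcessLookupError:
        pass


def _group_alive(pgid: int) -> bool:
    """Return True while at least one process remains in the group."""
    try:
        os.killpg(pgid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    return True


def stop_worker_tree(proc: subprocess.Popen, grace_seconds: float = 5.0) -> None:
    """Stop a worker and its children (hypothetical helper, POSIX only).

    Assumes `proc` was started with start_new_session=True, so the
    worker leads its own process group and proc.pid is the group id.
    """
    pgid = proc.pid
    # Ask the whole group to exit gracefully first.
    _signal_group(pgid, signal.SIGTERM)
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline and _group_alive(pgid):
        proc.poll()  # reap the worker itself so it does not linger as a zombie
        time.sleep(0.1)
    # Force-kill whatever is still holding on after the grace period.
    if _group_alive(pgid):
        _signal_group(pgid, signal.SIGKILL)
    proc.wait()


if __name__ == "__main__":
    # A stand-in "worker" that forks two long-running children.
    worker = subprocess.Popen(
        ["sh", "-c", "sleep 60 & sleep 60 & wait"],
        start_new_session=True,
    )
    time.sleep(0.5)
    stop_worker_tree(worker)
    print("worker exit code:", worker.returncode)
```

Signalling the process group catches children that the worker forked but does not track itself. A child that moves itself into a different session or group would escape this approach and would need a process-tree walk instead.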
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit tests (UT).
Codecov Report
Attention: Patch coverage is 95.34884% with 2 lines in your changes missing coverage. Please review.
Project coverage is 80.57%. Comparing base (6764a09) to head (c2e69a4). Report is 104 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| dlrover/python/elastic_agent/torch/training.py | 89.47% | 2 Missing :warning: |
Additional details and impacted files
Coverage Diff

| | master | #1284 | +/- |
|---|---|---|---|
| Coverage | 80.41% | 80.57% | +0.15% |
| Files | 219 | 219 | |
| Lines | 20126 | 20167 | +41 |
| Hits | 16185 | 16249 | +64 |
| Misses | 3941 | 3918 | -23 |
Duplicate of https://github.com/intelligent-machine-learning/dlrover/pull/1331