
Fix Flaky PyTorch Multiprocessing Test

Open NihalHarish opened this issue 5 years ago • 2 comments

Description of changes:

  • The dataset must be prepared once, outside the child processes. Preparing it inside each child leads to race conditions.
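
A minimal sketch of the pattern described above, not the actual test from this PR. It uses the standard-library `multiprocessing` module rather than PyTorch, and `prepare_dataset`, `train`, and `run` are hypothetical names introduced only for illustration. The parent writes the dataset once before any child starts, so the children only read it and cannot race on creating the same files.

```python
import multiprocessing as mp
import os
import tempfile


def prepare_dataset(data_dir):
    """Hypothetical stand-in for a download/unpack step: writes one small file."""
    path = os.path.join(data_dir, "dataset.txt")
    with open(path, "w") as f:
        f.write("\n".join(str(i) for i in range(100)))
    return path


def train(rank, dataset_path, results):
    """Child process body: only reads the dataset, never creates it."""
    with open(dataset_path) as f:
        samples = [int(line) for line in f]
    results.put((rank, sum(samples)))


def run(num_workers=2):
    with tempfile.TemporaryDirectory() as data_dir:
        # Prepare once in the parent, before any child process exists.
        dataset_path = prepare_dataset(data_dir)
        results = mp.Queue()
        procs = [
            mp.Process(target=train, args=(rank, dataset_path, results))
            for rank in range(num_workers)
        ]
        for p in procs:
            p.start()
        # Drain the queue before joining so no child blocks on a full pipe.
        out = sorted(results.get() for _ in procs)
        for p in procs:
            p.join()
    return out


if __name__ == "__main__":
    print(run())
```

If `prepare_dataset` were called inside `train` instead, every child would write the same path at the same time, and a sibling could read a partially written file. That is the kind of intermittent failure the description refers to.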

Style and formatting:

I have run pre-commit install to ensure that auto-formatting happens with every commit.

Issue number, if available

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

NihalHarish avatar Oct 03 '20 05:10 NihalHarish

Codecov Report

Merging #368 into master will decrease coverage by 2.78%. The diff coverage is n/a.


@@            Coverage Diff             @@
##           master     #368      +/-   ##
==========================================
- Coverage   85.47%   82.68%   -2.79%     
==========================================
  Files          86       86              
  Lines        6504     6504              
==========================================
- Hits         5559     5378     -181     
- Misses        945     1126     +181     
Impacted Files                                  Coverage Δ
smdebug/pytorch/__init__.py                     0.00% <0.00%> (-100.00%) ↓
smdebug/pytorch/singleton_utils.py              0.00% <0.00%> (-100.00%) ↓
smdebug/pytorch/collection.py                   0.00% <0.00%> (-90.00%) ↓
smdebug/pytorch/hook.py                         0.00% <0.00%> (-82.41%) ↓
smdebug/rules/action/stop_training_action.py    56.45% <0.00%> (-20.97%) ↓
smdebug/pytorch/utils.py                        0.00% <0.00%> (-18.52%) ↓
smdebug/rules/req_tensors.py                    79.16% <0.00%> (-11.12%) ↓
smdebug/core/tfevent/util.py                    92.00% <0.00%> (-8.00%) ↓
smdebug/tensorflow/callable_cache.py            78.26% <0.00%> (-4.35%) ↓
smdebug/rules/action/action.py                  91.83% <0.00%> (-4.09%) ↓
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update bb8f4b9...19fc0d7. Read the comment docs.

codecov-commenter avatar Oct 03 '20 05:10 codecov-commenter

Codecov Report

Merging #368 into master will decrease coverage by 2.78%. The diff coverage is n/a.


@@            Coverage Diff             @@
##           master     #368      +/-   ##
==========================================
- Coverage   85.47%   82.69%   -2.79%     
==========================================
  Files          86       86              
  Lines        6507     6507              
==========================================
- Hits         5562     5381     -181     
- Misses        945     1126     +181     
Impacted Files                                  Coverage Δ
smdebug/pytorch/__init__.py                     0.00% <0.00%> (-100.00%) ↓
smdebug/pytorch/singleton_utils.py              0.00% <0.00%> (-100.00%) ↓
smdebug/pytorch/collection.py                   0.00% <0.00%> (-90.00%) ↓
smdebug/pytorch/hook.py                         0.00% <0.00%> (-82.41%) ↓
smdebug/rules/action/stop_training_action.py    56.45% <0.00%> (-20.97%) ↓
smdebug/pytorch/utils.py                        0.00% <0.00%> (-18.52%) ↓
smdebug/rules/req_tensors.py                    79.16% <0.00%> (-11.12%) ↓
smdebug/core/tfevent/util.py                    92.00% <0.00%> (-8.00%) ↓
smdebug/tensorflow/callable_cache.py            78.26% <0.00%> (-4.35%) ↓
smdebug/rules/action/action.py                  91.83% <0.00%> (-4.09%) ↓
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 1f5b06d...5a19fd0. Read the comment docs.

codecov-io avatar Oct 07 '20 19:10 codecov-io