✨Autoscaling: add a delay before draining a node

Open sanderegg opened this issue 1 year ago • 2 comments

What do these changes do?

This PR adds a delay (EC2_INSTANCES_TIME_BEFORE_DRAINING) before draining active nodes (e.g. EC2 instances in the cluster that are available to run services), so that a dynamic sidecar that fails on start has time to restart and is not rejected.
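
The idea of delaying the drain can be sketched as follows. This is a minimal illustration, not the code from this PR: `NodeState`, `empty_since` and `should_drain` are hypothetical names, and the assumption is that a node only becomes drainable once it has stayed empty for the whole configured delay.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class NodeState:
    # Hypothetical model for illustration, not the autoscaling service's own.
    name: str
    running_services: int
    empty_since: Optional[datetime] = None  # when the node was first seen empty


def should_drain(
    node: NodeState, now: datetime, time_before_draining: timedelta
) -> bool:
    """Return True only once the node has been empty for the whole delay."""
    if node.running_services > 0:
        node.empty_since = None  # node is busy again: reset the timer
        return False
    if node.empty_since is None:
        node.empty_since = now  # first time seen empty: start the timer
        return False
    return now - node.empty_since >= time_before_draining


if __name__ == "__main__":
    node = NodeState(name="ec2-worker", running_services=0)
    start = datetime.now(timezone.utc)
    delay = timedelta(seconds=20)
    print(should_drain(node, start, delay))          # timer starts, no drain yet
    print(should_drain(node, start + delay, delay))  # delay elapsed, drain allowed
```

If a service lands on the node during the delay, the timer resets, which is what gives a restarting sidecar its window.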

driving tests:

test_set_node_found_empty
test_cluster_scaling_up_and_down

bonus:

  • some cleanup, test coverage still at 97%

Related issue/s

  • fixes https://github.com/ITISFoundation/osparc-simcore/issues/5842

How to test

make devenv
source .venv/bin/activate
cd services/autoscaling
make install-dev
make test-unit-dev

Dev-ops checklist

  • new ENVs EC2_INSTANCES_TIME_BEFORE_DRAINING and WORKERS_EC2_INSTANCES_TIME_BEFORE_DRAINING, each taking a timedelta between 10 seconds and 1 minute (values outside that range are auto-capped)
  • https://git.speag.com/oSparc/osparc-ops-deployment-configuration/-/merge_requests/598
  • [x] No ENV changes or I properly updated ENV (read the instruction)
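
The auto-capping described in the checklist could look roughly like the sketch below. It assumes "auto-capped" means out-of-range values are clamped to the nearest bound rather than rejected; `MIN_DELAY`, `MAX_DELAY` and `cap_time_before_draining` are hypothetical names, not taken from the service's settings module.

```python
from datetime import timedelta

# Bounds taken from the checklist above; the constant names are hypothetical.
MIN_DELAY = timedelta(seconds=10)
MAX_DELAY = timedelta(minutes=1)


def cap_time_before_draining(value: timedelta) -> timedelta:
    """Clamp the configured delay into [MIN_DELAY, MAX_DELAY]."""
    return max(MIN_DELAY, min(value, MAX_DELAY))


if __name__ == "__main__":
    for configured in (timedelta(seconds=1), timedelta(seconds=30), timedelta(minutes=5)):
        print(configured, "->", cap_time_before_draining(configured))
```

Clamping instead of raising means a misconfigured deployment still starts, at the cost of silently using a different value than the one set.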

sanderegg, May 17 '24 06:05

Codecov Report

Attention: Patch coverage is 94.54545%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 87.6%. Comparing base (cafbf96) to head (5cffffd). Report is 213 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff            @@
##           master   #5843      +/-   ##
=========================================
+ Coverage    84.5%   87.6%    +3.1%     
=========================================
  Files          10    1367    +1357     
  Lines         214   56790   +56576     
  Branches       25    1284    +1259     
=========================================
+ Hits          181   49801   +49620     
- Misses         23    6715    +6692     
- Partials       10     274     +264     

Flag               Coverage Δ
integrationtests   65.1% <ø> (?)
unittests          85.6% <94.5%> (+1.0%) ⬆️

Flags with carried forward coverage won't be shown.

Files                                                     Coverage Δ
...gs-library/src/settings_library/docker_registry.py     95.6% <100.0%> (ø)
...g/src/simcore_service_autoscaling/core/settings.py     100.0% <100.0%> (ø)
...oscaling/src/simcore_service_autoscaling/models.py     100.0% <100.0%> (ø)
.../simcore_service_autoscaling/utils/utils_docker.py     100.0% <100.0%> (ø)
...c/simcore_service_clusters_keeper/core/settings.py     96.2% <100.0%> (ø)
.../simcore_service_clusters_keeper/utils/clusters.py     97.5% <ø> (ø)
...e_service_autoscaling/modules/auto_scaling_core.py     94.2% <87.5%> (ø)

... and 1336 files with indirect coverage changes

codecov[bot], May 17 '24 07:05