
Linstor 4.19 fix script alllines

Open rp- opened this issue 1 year ago • 3 comments

Description

This PR enables draining on the AllLinesParser so that Script.execute() doesn't time out when it has to read larger output. Right now, commands with larger output that use AllLinesParser won't finish, because no one is emptying the stdout buffer, so the process never exits until it is killed by the timeout "interrupt".
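The general idea of draining can be sketched in plain Java as follows. This is a minimal illustration and not the code in this PR: the class `DrainingRunner` and the method `runAndCollect` are hypothetical names, and it does not use CloudStack's `Script` or `AllLinesParser` classes. It assumes a Unix-like system if you try it with a command such as `seq`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only (not part of CloudStack).
public class DrainingRunner {

    /**
     * Runs a command and collects all of its output lines.
     * stdout is read on a separate thread while we wait, so the child
     * never blocks on a full pipe buffer.
     */
    public static List<String> runAndCollect(long timeoutSeconds, String... command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr into stdout so one reader suffices
                .start();

        // Drain the output concurrently until EOF.
        CompletableFuture<List<String>> lines = CompletableFuture.supplyAsync(() -> {
            List<String> out = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    out.add(line);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return out;
        });

        // With the reader running, the timeout only fires for genuinely slow commands.
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("command timed out after " + timeoutSeconds + "s");
        }
        return lines.join();
    }
}
```

For example, `DrainingRunner.runAndCollect(30, "seq", "1", "200000")` should return all lines well within the timeout, because the reader thread keeps the pipe empty while the command is still writing.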

Types of changes

  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] New feature (non-breaking change which adds functionality)
  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] Enhancement (improves an existing feature and functionality)
  • [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
  • [ ] build/CI

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • [ ] Major
  • [ ] Minor

Bug Severity

  • [ ] BLOCKER
  • [ ] Critical
  • [x] Major
  • [ ] Minor
  • [ ] Trivial

Screenshots (if appropriate):

How Has This Been Tested?

Tested with local Script executions and an added unit test.

rp- avatar Feb 16 '24 12:02 rp-

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 30.94%. Comparing base (6f3e4e6) to head (5588d8f). Report is 52 commits behind head on 4.19.

Additional details and impacted files
@@             Coverage Diff              @@
##               4.19    #8670      +/-   ##
============================================
+ Coverage     30.90%   30.94%   +0.03%     
- Complexity    34187    34229      +42     
============================================
  Files          5347     5347              
  Lines        375578   375579       +1     
  Branches      54629    54629              
============================================
+ Hits         116063   116212     +149     
+ Misses       244245   244079     -166     
- Partials      15270    15288      +18     
Flag | Coverage Δ
--- | ---
simulator-marvin-tests | 24.82% <0.00%> (+0.06%) :arrow_up:
uitests | 4.39% <ø> (ø)
unit-tests | 16.56% <100.00%> (+<0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.


codecov[bot] avatar Feb 16 '24 14:02 codecov[bot]

So this must be included to allow for disaster recovery on large scale installations I read. Why is that?

DaanHoogland avatar Feb 23 '24 08:02 DaanHoogland

> So this must be included to allow for disaster recovery on large scale installations I read. Why is that?

I use Script to read drbdsetup command output to check resource states. If the stdout gets a bit larger (how large depends on the system's buffer settings), script.execute() with AllLinesParser simply times out, because nobody is reading the output buffer to EOF.
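The failure mode described here can be reproduced outside CloudStack with a small Java sketch. It is only an illustration under assumptions: `UndrainedPipeDemo` and `exitsWithoutReader` are hypothetical names, the `seq` command stands in for a command with large output, and a Unix-like system is assumed. The process is started and then waited on without anyone reading its stdout.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical demo, for illustration only (not part of CloudStack).
public class UndrainedPipeDemo {

    /**
     * Starts a command and waits for it WITHOUT reading its stdout.
     * Returns true if the command exited within the timeout.
     */
    public static boolean exitsWithoutReader(long timeoutMillis, String... command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        try {
            // Nobody reads process.getInputStream(), so once the pipe buffer
            // is full the child blocks in write() and cannot exit.
            return process.waitFor(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            process.destroyForcibly(); // clean up a child that is still blocked
        }
    }

    public static void main(String[] args) throws Exception {
        // A few bytes of output fit into the pipe buffer.
        System.out.println("small output exits: " + exitsWithoutReader(5000, "seq", "1", "10"));
        // Several megabytes of output do not fit into a typical pipe buffer.
        System.out.println("large output exits: " + exitsWithoutReader(2000, "seq", "1", "2000000"));
    }
}
```

On a typical Linux system the small command should exit normally, while the large one should still be blocked when the timeout expires. The exact output size at which this starts to happen depends on the pipe buffer size of the system.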

rp- avatar Feb 23 '24 08:02 rp-

I also wanted to mention that this fix is more of a must-have for 4.19.1; otherwise any slightly larger Linstor setup will break, because you can't fully disable HA: https://github.com/apache/cloudstack/issues/8682

rp- avatar Mar 20 '24 12:03 rp-

@sureshanaparti I think we can merge this after smoke tests

DaanHoogland avatar Mar 25 '24 09:03 DaanHoogland

@blueorangutan package

DaanHoogland avatar Mar 25 '24 09:03 DaanHoogland

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Mar 25 '24 09:03 blueorangutan

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 9025

blueorangutan avatar Mar 25 '24 10:03 blueorangutan

@blueorangutan test

sureshanaparti avatar Mar 25 '24 13:03 sureshanaparti

@sureshanaparti a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

blueorangutan avatar Mar 25 '24 13:03 blueorangutan

[SF] Trillian test result (tid-9576)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 52562 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8670-t9576-kvm-centos7.zip
Smoke tests completed. 127 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test | Result | Time (s) | Test File
--- | --- | --- | ---
test_05_vmschedule_test_e2e | Failure | 361.86 | test_vm_schedule.py
test_01_redundant_vpc_site2site_vpn | Failure | 405.04 | test_vpc_vpn.py

blueorangutan avatar Mar 26 '24 04:03 blueorangutan

@blueorangutan test alma9 kvm-alma9

DaanHoogland avatar Mar 29 '24 13:03 DaanHoogland

@DaanHoogland a [SL] Trillian-Jenkins test job (alma9 mgmt + kvm-alma9) has been kicked to run smoke tests

blueorangutan avatar Mar 29 '24 13:03 blueorangutan

[SF] Trillian test result (tid-9619)
Environment: kvm-alma9 (x2), Advanced Networking with Mgmt server a9
Total time taken: 53658 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8670-t9619-kvm-alma9.zip
Smoke tests completed. 129 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test | Result | Time (s) | Test File
--- | --- | --- | ---

blueorangutan avatar Mar 30 '24 04:03 blueorangutan