DAOS-17659 control: Enable setting property value with multiple strings
Fix an issue where a property value containing multiple comma-separated strings cannot be set, e.g. self_heal:exclude,rebuild.
Features: control
Steps for the author:
- [ ] Commit message follows the guidelines.
- [ ] Appropriate Features or Test-tag pragmas were used.
- [ ] Appropriate Functional Test Stages were run.
- [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
- [ ] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.
After all prior steps are complete:
- [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).
Ticket title is 'Unable to set pool property self_heal to exclude,rebuild'
Status is 'In Review'
https://daosio.atlassian.net/browse/DAOS-17659
Test stage Build RPM on EL 8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/1/execution/node/330/log
Test stage Build RPM on EL 9 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/1/execution/node/334/log
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/1/execution/node/327/log
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16503/1/testReport/
Thanks a lot for the quick responses, everyone.
This patch doesn't feel right (I hope I'm not misleading anyone):
- The `,` does not require escaping as far as the shell is concerned.
- The shell removes the escaping before our code sees the arguments. (See my Python experiment below. I also tried C in case Python is too smart, but I got the same result.)
- We need to document (in `dmg` help or the man page?) which syntaxes are supported for setting the pool property `self_heal`.
- The unit test doesn't reflect the real inputs from the shell.
% python -c 'import sys; print(sys.argv)' a b=c d=e,f g='h i' j='k,l' n=o\,p
['-c', 'a', 'b=c', 'd=e,f', 'g=h i', 'j=k,l', 'n=o,p']
Yeah, after applying this patch, I got the same errors when setting self_heal to exclude,rebuild:
bash-5.1$ ~/daos/install/bin/dmg -o ~/daos_control.yml -i pool set-prop p0 self_heal:exclude
pool set-prop succeeded
bash-5.1$ ~/daos/install/bin/dmg -o ~/daos_control.yml -i pool set-prop p0 self_heal:exclude,rebuild
ERROR: dmg: invalid property "rebuild" (must be key:val)
bash-5.1$ ~/daos/install/bin/dmg -o ~/daos_control.yml -i pool set-prop p0 self_heal:'exclude,rebuild'
ERROR: dmg: invalid property "rebuild" (must be key:val)
bash-5.1$ ~/daos/install/bin/dmg -o ~/daos_control.yml -i pool set-prop p0 self_heal:exclude\,rebuild
ERROR: dmg: invalid property "rebuild" (must be key:val)
This worked, however:
bash-5.1$ ~/daos/install/bin/dmg -o ~/daos_control.yml -i pool set-prop p0 self_heal:\'exclude,rebuild\'
pool set-prop succeeded
Sounds like the set-prop syntax design [<key:val[,key:val...]>] doesn't work well with the self_heal value syntax design flag0[,flag1...].
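The session above can be reproduced in miniature. This is only a sketch: `shlex` stands in for the shell's word splitting, and `naive_parse` is a hypothetical parser that splits on every comma before matching key:val. That behaviour is consistent with the error message but is an assumption, not the actual dmg code.

```python
import shlex


def received_argv(typed):
    """Return the argument a program would receive for one typed shell word.

    shlex in POSIX mode removes quotes and backslash escapes, which
    approximates what bash does before dmg sees the argument.
    """
    return shlex.split(typed)[0]


def naive_parse(arg):
    """Hypothetical parser: split on every comma, then require key:val."""
    props = {}
    for item in arg.split(","):
        key, sep, val = item.partition(":")
        if not sep:
            raise ValueError(f'invalid property "{item}" (must be key:val)')
        props[key] = val
    return props


typed_forms = [
    "self_heal:exclude,rebuild",
    "self_heal:'exclude,rebuild'",
    r"self_heal:exclude\,rebuild",
    r"self_heal:\'exclude,rebuild\'",
]

# Show what each typed form turns into after shell processing.
for typed in typed_forms:
    print(f"{typed:32} -> {received_argv(typed)}")

# The first three forms all arrive as the same string, so a comma-splitting
# parser rejects them identically.
try:
    naive_parse("self_heal:exclude,rebuild")
except ValueError as err:
    print(f"ERROR: {err}")
```

Only the last form delivers literal quote characters to the program, which fits the observation that it is the only one dmg accepted.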
If escaping the single or double quotes isn't acceptable, should the parser check whether the token following the comma matches a compatible flag name? That is, try to match key:val first, and otherwise try to match a flag. What do you think?
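A minimal sketch of that fallback idea, in Python for brevity. Everything here is hypothetical: `FLAG_PROPS` lists only the two `self_heal` flags mentioned in this thread (the real set may differ), and `parse_with_flag_fallback` is not an existing dmg function.

```python
# Hypothetical table of properties whose values are comma-separated flags.
# Only the flags named in this thread are listed; the real set may differ.
FLAG_PROPS = {"self_heal": {"exclude", "rebuild"}}


def parse_with_flag_fallback(arg):
    """Parse key:val[,key:val...], letting flag values continue after a comma."""
    props = []  # ordered list of [key, value] pairs
    for item in arg.split(","):
        key, sep, val = item.partition(":")
        if sep:
            # First choice: the token is a regular key:val pair.
            props.append([key, val])
            continue
        # Fallback: the token may be another flag for the previous property.
        if props and item in FLAG_PROPS.get(props[-1][0], set()):
            props[-1][1] += "," + item
            continue
        raise ValueError(f'invalid property "{item}" (must be key:val)')
    return dict(props)


print(parse_with_flag_fallback("self_heal:exclude,rebuild,k:v"))
```

The trade-off is that the parser needs to know which properties take flag lists, so a typo in a flag name still surfaces as the generic key:val error unless the message is specialised.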
Could we use some other set of characters to wrap the values (maybe [])? Otherwise, like @liw said, it feels like we made a design mistake somewhere along the line.
I'm not completely against users having to escape the quotes, but they may find it a bit annoying (and hard to figure out what they did wrong if they don't escape them).
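For comparison, the bracket-wrapping idea could look roughly like this. It is a sketch under assumptions: the `key:[flag0,flag1]` syntax and both function names are hypothetical and not part of dmg.

```python
def split_top_level(arg):
    """Split on commas that are not enclosed in square brackets."""
    items, current, depth = [], [], 0
    for ch in arg:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ']'")
        elif ch == "," and depth == 0:
            # A top-level comma ends the current key:val item.
            items.append("".join(current))
            current = []
            continue
        current.append(ch)
    if depth != 0:
        raise ValueError("unbalanced '['")
    items.append("".join(current))
    return items


def parse_bracketed(arg):
    """Parse key:val[,key:val...] where a value may be wrapped as [a,b]."""
    props = {}
    for item in split_top_level(arg):
        key, sep, val = item.partition(":")
        if not sep:
            raise ValueError(f'invalid property "{item}" (must be key:val)')
        if val.startswith("[") and val.endswith("]"):
            val = val[1:-1]  # strip the wrapping brackets
        props[key] = val
    return props


print(parse_bracketed("self_heal:[exclude,rebuild],k:v"))
```

One caveat: `[` also begins a bracket expression in shell pathname expansion, so an unquoted bracketed value would normally reach the program unchanged only as long as no file name matches the pattern.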
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16503/2/testReport/
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16503/2/testReport/
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/2/execution/node/1454/log
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/2/execution/node/1544/log
Test stage Functional Hardware Medium MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16503/3/testReport/
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/3/execution/node/1479/log
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16503/4/testReport/
Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/4/execution/node/1644/log
I think this is ready to land: it has passed all CI hardware test stages after multiple "run from stage" retries. The NLT failures look unrelated, and the approvals are in. Requesting gatekeeper.
@daos-stack/daos-gatekeeper Can this be landed, given that CI is all green and the NLT failure is unrelated? Otherwise, please let me know if I need to rerun. TIA
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16503/9/testReport/
Test stage Build RPM on EL 8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/12/execution/node/326/log
Test stage Build RPM on EL 9 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/12/execution/node/329/log
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/12/execution/node/342/log
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16503/14/testReport/
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16503/15/testReport/
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16503/16/testReport/
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16503/17/execution/node/1510/log
Test stage Functional Hardware Large MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16503/17/testReport/
Looks like all stages actually completed. Failures are known issues:
- EcIor/ior_smoke.py: https://daosio.atlassian.net/browse/DAOS-17842
- DaosCoreTest/test_daos_rebuild_ec: https://daosio.atlassian.net/browse/DAOS-17091
CI run 18 passed all stages; removing the forced-landing label.