DAOS-17308: ddb: add command 'setup' to setup the tmpfs and vos f…
Setup tmpfs and VOS file according to the pool information stored in SMD.
Steps for the author:
- [ ] Commit message follows the guidelines.
- [ ] Appropriate Features or Test-tag pragmas were used.
- [ ] Appropriate Functional Test Stages were run.
- [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
- [ ] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.
After all prior steps are complete:
- [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).
Errors are component not formatted correctly,Ticket number prefix incorrect,PR title is malformatted. See https://daosio.atlassian.net/wiki/spaces/DC/pages/11133911069/Commit+Comments,Unable to load ticket data https://daosio.atlassian.net/browse/
Test stage Build on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16527/1/execution/node/339/log
Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16527/1/execution/node/343/log
Test stage Build RPM on EL 9 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16527/1/execution/node/310/log
Test stage Build RPM on EL 8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16527/1/execution/node/307/log
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16527/1/execution/node/313/log
Test stage Build on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16527/2/execution/node/339/log
Test stage Build RPM on EL 8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16527/2/execution/node/288/log
Test stage Build RPM on EL 9 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16527/2/execution/node/291/log
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16527/2/execution/node/333/log
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16527/7/testReport/
When you force push you are forcing your reviewers to look at all the code changes all over again. This can be frustrating as really you only want the reviewers to have to review the differential since the last time they looked.
Sorry, my apologies. I'll use 'git commit --amend' to push my changes later.
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16527/8/testReport/
You don't have to use 'git commit --amend'. When you make new commits to address review comments, just leave them as separate commits (don't run 'git rebase' to squash them into the first commit; use 'git merge' instead when you want to merge from master), so that they'll be shown as stacked commits for review when you push them.
Now I'm clear on how to push my changes, thanks.
@sherintg
For ddb to run correctly with pmdk, one needs to set PMEMOBJ_CONF=sds.at_create=0. Can this be automatically set by ddb in main?
- For md-on-pmem no recreation should take place. So, `sds.at_create` has no effect.
- For md-on-ssd recreation is only for non-PMEMOBJ files:
  - PMEMOBJ is used only for sysdb which is not recreated,
  - VOS files use a different allocator.
Am I missing something?
Major thoughts:
- IMHO you do not have to recreate the daos_server formula for the tmpfs mount size. DDB needs a tmpfs big enough to accommodate all the VOS pool files you recreate, which would be a lot simpler to calculate and makes a lot more sense from the DDB perspective.
- The algorithm you introduced to (a) check whether the mountpoint is in use, (b) check whether it is tmpfs or not, and (c) check whether it is big enough for your purposes is both complex and unnecessary, for multiple reasons: (1) I assume the `prov_mem` command is a high-level administrative command, and if `mount(8)` allows creating multiple mount layers, so should we; (2) if we decide multiple mount layers are probably an unintentional error, we can call `umount(2)` on the mountpoint first; and (3) we should explain to the end user that `prov_mem` creates a mount, so it acts like the `mount(8)` command, and it is up to the administrator to provide reasonable arguments to this command.
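The simpler sizing suggested in the first point could look roughly like the sketch below. It is an illustration only, not code from this PR: the function name, the 5% headroom, and the 4 KiB page size are assumptions, and the real VOS file sizes would come from the pool information stored in SMD.

```python
def _ceil_div(value, divisor):
    """Integer division that rounds up instead of down."""
    return -(-value // divisor)


def required_tmpfs_bytes(vos_file_sizes, headroom_percent=5, page_size=4096):
    """Estimate a tmpfs size that can hold all recreated VOS files.

    Hypothetical helper: every file is rounded up to whole pages, the
    total is padded by `headroom_percent`, and the result is rounded up
    to a page boundary again.
    """
    if headroom_percent < 0 or page_size <= 0:
        raise ValueError("headroom_percent must be >= 0 and page_size > 0")

    total = 0
    for size in vos_file_sizes:
        if size < 0:
            raise ValueError("file sizes must be non-negative")
        # Round each file up to a whole number of pages.
        total += _ceil_div(size, page_size) * page_size

    # Add the headroom, then align the final size to a page boundary.
    padded = _ceil_div(total * (100 + headroom_percent), 100)
    return _ceil_div(padded, page_size) * page_size
```

With this approach the tool only needs the list of VOS file sizes, and the administrator can still override the result with an explicit size argument.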
Based on your opinion, should I remove the unnecessary automatic calculation formulas and directly use the system memory as the default value? If the user deems that unreasonable, they can specify the expected value through parameters?
I am wondering how `ddb prov_mem` will be used, maybe something like this:

    ddb prov_mem
    ddb pool1/vos-0 sub_cmd1
    ddb pool1/vos-1 sub_cmd2
    ...

That means `prov_mem` runs only once, before the other ddb sub-commands. Do we need some environment reset after all `ddb` tasks are done? Or is a direct `umount tmpfs` enough?

When `ddb` runs in interactive mode, do we need to automatically reset the environment when exiting the `ddb` shell? @yunpeng-wang-panasas @NiuYawei

On the other hand, the `setup` is a general process for the md-on-ssd case, used not only by `ddb` but also by `dlck` and the normal engine start process. So can we make a general function and allow related users/components to call it directly without repeated work? @janekmi @yunpeng-wang-panasas
A simple usage:
[root@wang-runtime ~]# ddb
ddb version 2.7.101
ddb: prov_mem /var/daos/config/daos_control/engine0/ /mnt/daos
ddb:
After the prov_mem command executes successfully, the target path (e.g. /mnt/daos) becomes a tmpfs mount point, which the user must unmount manually. Apart from this, no additional operations are required. By the way, the user can also create a mount point on the target path first and then execute prov_mem.
This is not repeated work. prov_mem consists of two parts: mounting and generating the VOS files.
- The mounting logic is fully implemented in daos_server (ScmFormat), but it is written in Go and too tightly coupled for the ddb tool to call directly and simply, so I implemented a simple version myself.
- The function that generates the VOS files is engine code. This logic is in libmgmt.so, which ddb cannot call directly, so the relevant function has been moved to libvos.so.
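Since the target path is expected to be a tmpfs mount after prov_mem and has to be unmounted by hand, a quick check of the mount table can confirm the state before and after. The sketch below is only an illustration and is not part of ddb: the function names are hypothetical, it parses text in the `/proc/mounts` format (device, mount point, filesystem type, options), and it does not handle mount points containing escaped characters such as `\040`.

```python
def find_mount(mounts_text, target):
    """Return (fstype, options) for the last entry mounted at `target`.

    Entries later in the table are taken to be mounted on top of earlier
    ones (an assumption of this sketch), so the last match wins.
    Returns None when nothing is mounted at `target`.
    """
    target = target.rstrip("/") or "/"
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fstype, options = fields[:4]
        if mount_point == target:
            found = (fstype, options.split(","))
    return found


def is_tmpfs_mount(mounts_text, target):
    """True when the topmost mount at `target` is a tmpfs."""
    entry = find_mount(mounts_text, target)
    return entry is not None and entry[0] == "tmpfs"
```

On a Linux host it could be called as `is_tmpfs_mount(open("/proc/mounts").read(), "/mnt/daos")`.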
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16527/10/testReport/
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16527/10/testReport/
Thanks @yunpeng-wang-panasas for the explanation. I mean that it would be more convenient to export a VOS API that packages the VOS part of the md-on-ssd setup, such as path generation, VOS file cleanup, pre-allocation, and so on, so that other users, such as dlck, can call that API directly without repeated work. I notice that @janekmi does similar work in his dlck patch, hence the reminder.
@Nasf-Fan I'm afraid that's out of scope. The VOS files management is now in mgmt module, I'm not quite sure if it's a good idea to move it into VOS. Even if we all agree on moving that part into VOS, I think it should be done in a separate task, what do you think?
Personally, I do not think that mgmt is a suitable place for backend operations, such as mkdir/fallocate/sync, or for the backend layout, such as NEWBORNS/ZOMBIES, etc. All of these should be hidden inside VOS. Otherwise, there will be multiple direct operators for the same backend storage. Instead, I prefer to handle all related things via some VOS APIs, which can then be shared by regular DAOS processes, such as target create and md-on-ssd start setup, and also be reused by DAOS backend utils, such as DLCK and DDB.
From the DLCK and DDB perspective, they only care about the backend storage and should not depend on other module(s). But because of the current code organization, they have to link some unrelated module(s) to compile, which is not reasonable.
It is true that the above suggestion involves a lot of changes and is out of this patch's scope. But the current situation is that @janekmi is working on DLCK, and his patch contains a similar md-on-ssd setup process. I would suggest that the two share the same logic to avoid repeated work or merge trouble in the future.
Yes, I think we could make the change (moving the shared part of DLCK and DDB into VOS or a common shared lib) in a separate task (PR).
It looks like we did not think through this aspect before starting DLCK and enabling md-on-ssd for DDB (what this PR aims to deliver). I am not convinced these bits belong with VOS. VOS is already quite bloated. I am sure if we introduce new APIs in VOS they will stay there for a long time.
I think the least disruptive thing we can do right now is to put bits which have to be shared among the interested parties (daos_server, daos_engine, ddb, dlck) into dedicated compilation units which will have the least possible amount of dependencies so it will be easy to compile them wherever they are required (*). And it should be done properly by this PR (FYI @yunpeng-wang-panasas ) and/or my PR (#16550) whoever does it first. Please note it will be also relatively easy to merge this way too.
(*) This is also true for header files. vos.h is not a place to put things where we do not have a better place for them. If it is not the VOS interface these bits do not belong there. I think creating dedicated headers, even temporarily till we come up with some good design here, is a better idea.
At a later time we can discuss whether these commonly shared bits belong with VOS, constitute some new accessory library, or are fine as they are.
Test stage Functional Hardware Medium MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16527/10/testReport/
FYI @NiuYawei @yunpeng-wang-panasas @Nasf-Fan as agreed I prepared a PR just moving around bits necessary for both DDB and DLCK. It should be easy for both of them to just compile with the newly created compilation unit.
Please review as soon as possible to unblock this PR: https://github.com/daos-stack/daos/pull/16779
I heavily reused work already done by @yunpeng-wang-panasas here. I marked you as co-author and wherever it made sense I also added Vdura copyrights.
@yunpeng-wang-panasas here you can find my proposal for how you can separate the bits you have written which may be of use for both DDB and DLCK (different from the bits I reorganize in #16779): https://github.com/janekmi/daos/commit/eaac118bb564695928d532d20748a1a13c4603c9
With them in a separate module, DLCK can easily use them if necessary.
I am ok if you decide to just cherry-pick this commit or reorganize the code yourself however you see fit.
@janekmi I'll cherry-pick this commit (eaac118). Thanks a lot.
BTW, I saw that the PR DAOS-17939 (https://github.com/daos-stack/daos/pull/16779) has already landed, and I'm trying to rebase my branch. But it seems that I still depend on libmgmt.so when I try to call ds_mgmt_tgt_recreate. Is there any commit in DLCK that shows how to call this function without libmgmt.so?