Split plan to multiple plans by --max tests
Pull Request Checklist
- [ ] implement the feature
- [ ] write the documentation
- [ ] extend the test coverage
- [ ] update the specification
- [ ] adjust plugin docstring
- [ ] modify the json schema
- [ ] mention the version
- [ ] include a release note
Splitting is easy. Running them will be hard.
$ hatch run dev:tmt -c trigger=commit -vv run discover plan -n '^/plans/features/core$' /var/tmp/tmt/run-006 Found 1 plan. /plans/features/core summary: Verify core functionality discover how: fmf directory: /home/happz/git/tmt hash: 78742f93 filter: tier: 0, 1 & tag:-provision-only summary: 41 tests selected /tests/backward-compatibility /tests/core/adjust /tests/core/context /tests/core/debug /tests/core/docs /tests/core/dry /tests/core/enabled /tests/core/env /tests/core/environment-file /tests/core/error /tests/core/escaping /tests/core/feeling-safe /tests/core/fmf-id /tests/core/force /tests/core/link /tests/core/logging /tests/core/ls /tests/core/order /tests/core/path /tests/core/phases /tests/core/plan-env-file /tests/core/smoke /tests/core/spaces /tests/core/web-link /tests/init/base /tests/init/full /tests/init/git /tests/init/mini /tests/init/nested /tests/lint/all /tests/lint/plan/explicit-root /tests/lint/plan/implicit-root /tests/lint/story /tests/lint/test /tests/plan/create /tests/plan/show /tests/story/create /tests/story/show /tests/test/create /tests/test/show /tests/unit/with-development-packages/basic$ hatch run dev:tmt -c trigger=commit -vv run --max 10 discover plan -n '^/plans/features/core$' /var/tmp/tmt/run-007 Found 1 plan. /plans/features/core summary: Verify core functionality discover how: fmf directory: /home/happz/git/tmt hash: 78742f93 filter: tier: 0, 1 & tag:-provision-only summary: 41 tests selected /tests/backward-compatibility /tests/core/adjust /tests/core/context /tests/core/debug /tests/core/docs /tests/core/dry /tests/core/enabled /tests/core/env /tests/core/environment-file /tests/core/error /tests/core/escaping /tests/core/feeling-safe /tests/core/fmf-id /tests/core/force /tests/core/link /tests/core/logging /tests/core/ls /tests/core/order /tests/core/path /tests/core/phases /tests/core/plan-env-file /tests/core/smoke /tests/core/spaces /tests/core/web-link /tests/init/base /tests/init/full /tests/init/git /tests/init/mini /tests/init/nested /tests/lint/all /tests/lint/plan/explicit-root /tests/lint/plan/implicit-root /tests/lint/story /tests/lint/test /tests/plan/create /tests/plan/show /tests/story/create /tests/story/show /tests/test/create /tests/test/show /tests/unit/with-development-packages/basic /plans/features/core.1 summary: Verify core functionality discover status: done summary: 10 tests selected /tests/backward-compatibility /tests/core/adjust /tests/core/context /tests/core/debug /tests/core/docs /tests/core/dry /tests/core/enabled /tests/core/env /tests/core/environment-file /tests/core/error /plans/features/core.2 summary: Verify core functionality discover status: done summary: 10 tests selected /tests/core/escaping /tests/core/feeling-safe /tests/core/fmf-id /tests/core/force /tests/core/link /tests/core/logging /tests/core/ls /tests/core/order /tests/core/path /tests/core/phases /plans/features/core.3 summary: Verify core functionality discover status: done summary: 10 tests selected /tests/core/plan-env-file /tests/core/smoke /tests/core/spaces /tests/core/web-link /tests/init/base /tests/init/full /tests/init/git /tests/init/mini /tests/init/nested /tests/lint/all /plans/features/core.4 summary: Verify core functionality discover status: done summary: 10 tests selected /tests/lint/plan/explicit-root /tests/lint/plan/implicit-root /tests/lint/story /tests/lint/test /tests/plan/create /tests/plan/show /tests/story/create /tests/story/show /tests/test/create /tests/test/show /plans/features/core.5 summary: Verify core functionality discover status: done summary: 1 test selected /tests/unit/with-development-packages/basic
Another approach could be to separate only the report step? That way the prepare and finish steps can be shared. execute splitting might not be desired since it would lack prepare/finish, in which case it should use the full splitting like here.
Another approach could be to separate only the
reportstep? That way theprepareandfinishsteps can be shared.executesplitting might not be desired since it would lackprepare/finish, in which case it should use the full splitting like here.
I'm afraid I don't follow, can you elaborate?
It depends on what the goal for splitting the plans is. If it's for running the tests in parallel then this approach where all test steps are separated prepare, execute, finish, etc.
This can be wasteful if the prepare or finish steps are demanding, e.g. compiling a project. In that case the user would primarily want just the report step to be separate such that testing-farm can group them for navigability.
After some tweaks, this prototype works. It's bad, it's crude, and it probably breaks a dozen "must-have" assumptions mentioned in #308, but I'd call it a start. Each subplan gets its own workdir, all subplans have different names, there is no smart aggregation, the original plan still appears in the output, there's no sharing of guests, all plans provision all guests, not just those needed for their tests, the split is based purely on the number of tests discovered, and so on and on and on.
Multiple other methods were proposed for splitting, other use cases, and possibly even combinations of methods (separate by test-level hardware key, sort shared HW requirements into same guests, then apply --max, and so on...)
It depends on what the goal for splitting the plans is. If it's for running the tests in parallel then this approach where all test steps are separated
prepare,execute,finish, etc.
Plan-level parallelization, i.e. tmt performing more than one plan at the same time, is beyond tmt capabilities right now. It will be needed though - Testing Farm does it and tmt will need to learn to take over this functionality, and some use cases for plan splitting will benefit greatly (and some wouldn't even make much sense without it).
Some of the use cases for the splitting itself mentioned in #308:
- moving
hardwarekey from plan'sprovisionto test metadata. tmt then should be capable of putting tests with the same (or compatible...)hardwarerequirements into the same "buckets", then run subplans with subsets of tests. - mutually exclusive tests - test A cannot run on the same guest as test B, it would be easier to store this information in test metadata rather than in plan metadata, tmt would then sort them into corresponding buckets, "cloning" the rest of the plan steps.
-
--maxaka splitting a large plan into smaller chunks - this one definitely calls for plan-level parallelization: running a large plan "per partes", use more guests to get results quicker.
This can be wasteful if the
prepareorfinishsteps are demanding, e.g. compiling a project. In that case the user would primarily want just thereportstep to be separate such that testing-farm can group them for navigability.
I assume this scenario is one of those that would benefit from the plan-level parallelization, e.g. one subplan doing its expensive prepare while other subplans are already crunching tests. I'd say this can be already set up with careful use of when, but it'd hit the serial nature of tmt running one plan after another. Definitely related, although I would say we need another issue for this feature, to track it on its own, because even though the yare related, both can be developed separately.
Nice, I like the plan_shapers.
I wonder if there is a different approach to the
TestOrigin: TypeAlias = tuple[str, 'tmt.Test']in order to avoid all thefor _, test ..., but I presume there is a reason phase name is not an attribute of tmt.Test.
I for one would like to have a container similar to TestInvocation, to bundle together the test, its "parent" phase, and one or two other bits. It would be possible to put all these into a Test instance, but that does not feel right to me, as it's mixing test metadata with the bigger view the test itself does not necessarily need to be aware of. The test's phase is important for code moving the test around, but it does not seem to be important for any code inside Test, hence "outside", at least for now.
Just want to clarify the expectations: As for now, we execute plans in sequence, so currently this will not speed up testing. But once we have implemented parallel execution for plans this could be enabled in Testing Farm and
tmtwould take care of the parallel execution (completely), e.g. this will be working with the new pipeline only, right?
Correct, I guess. Yes, this patch alone will not bring any speedup with tmt alone. It's not part of the current implementation, but it should be possible to extend the functionality to be available in the discover step as well, then TF would benefit from the splitting too (if TF ever allows passing --max to tmt discover). Which I think will be needed anyway, to address the "and how can I re-run just one of these "virtual" plans?" question.
Once - if... - tmt learns to run multiple plans at the same time, then yes, tmt alone would be able to crunch multiple plans at once, including the split ones.
Correct, I guess. Yes, this patch alone will not bring any speedup with tmt alone. It's not part of the current implementation, but it should be possible to extend the functionality to be available in the
discoverstep as well, then TF would benefit from the splitting too (if TF ever allows passing--maxtotmt discover). Which I think will be needed anyway, to address the "and how can I re-run just one of these "virtual" plans?" question.
Isn't Testing Farm running tmt plans ls first to check which plans are available? I think that part cannot / should not be covered by the plan splitting. Or?
Btw, I've noticed a warning about an invalid schema when running the discover step separately:
> tmt run --max 3 discover
/var/tmp/tmt/run-145
warn: User is feeling safe.
/plan
discover
how: shell
summary: 5 tests selected
Splitting plan to batches of 3 tests.
warn: /plan:discover - {'how': 'shell', 'tests': [{'name': 'Test 1', 'test': 'echo "1"'}, {'name': 'Test 2', 'test': 'echo "2"'}, {'name': 'Test 3', 'test': 'echo "3"'}, {'name': 'Test 4', 'test': 'echo "4"'}, {'name': 'Test 5', 'test': 'echo "5"'}], 'name': 'default-0', 'feeling-safe': True} is not valid under any of the given schemas
warn: /plan:discover - {'how': 'shell', 'tests': [{'name': 'Test 1', 'test': 'echo "1"'}, {'name': 'Test 2', 'test': 'echo "2"'}, {'name': 'Test 3', 'test': 'echo "3"'}, {'name': 'Test 4', 'test': 'echo "4"'}, {'name': 'Test 5', 'test': 'echo "5"'}], 'name': 'default-0', 'feeling-safe': True} is not valid under any of the given schemas
/plan.1
discover
status: done
summary: 3 tests selected
/plan.2
discover
status: done
summary: 2 tests selected
When the full tmt run is performed, the original plan is still included in the output with no results found in the report step and no guests provisioned in the finish step:
/plan
discover
how: shell
summary: 5 tests selected
Splitting plan to batches of 3 tests.
report
how: display
summary: no results found
finish
warn: Nothing to finish, no guests provisioned.
Shall we somehow skip the report and finish step for the original plans?
Correct, I guess. Yes, this patch alone will not bring any speedup with tmt alone. It's not part of the current implementation, but it should be possible to extend the functionality to be available in the
discoverstep as well, then TF would benefit from the splitting too (if TF ever allows passing--maxtotmt discover). Which I think will be needed anyway, to address the "and how can I re-run just one of these "virtual" plans?" question.Isn't Testing Farm running
tmt plans lsfirst to check which plans are available? I think that part cannot / should not be covered by the plan splitting. Or?Btw, I've noticed a warning about an invalid schema when running the
discoverstep separately:> tmt run --max 3 discover /var/tmp/tmt/run-145 warn: User is feeling safe. /plan discover how: shell summary: 5 tests selected Splitting plan to batches of 3 tests. warn: /plan:discover - {'how': 'shell', 'tests': [{'name': 'Test 1', 'test': 'echo "1"'}, {'name': 'Test 2', 'test': 'echo "2"'}, {'name': 'Test 3', 'test': 'echo "3"'}, {'name': 'Test 4', 'test': 'echo "4"'}, {'name': 'Test 5', 'test': 'echo "5"'}], 'name': 'default-0', 'feeling-safe': True} is not valid under any of the given schemas warn: /plan:discover - {'how': 'shell', 'tests': [{'name': 'Test 1', 'test': 'echo "1"'}, {'name': 'Test 2', 'test': 'echo "2"'}, {'name': 'Test 3', 'test': 'echo "3"'}, {'name': 'Test 4', 'test': 'echo "4"'}, {'name': 'Test 5', 'test': 'echo "5"'}], 'name': 'default-0', 'feeling-safe': True} is not valid under any of the given schemas
This is really weird, I wouldn't expect tmt to create invalid plans, and it looks like there's some kind of leak of fields: feeling-safe shouldn't be there. I suspect copy.deepcopy() is causing this mess.
When the full
tmt runis performed, the original plan is still included in the output withno results foundin the report step andno guests provisionedin the finish step:/plan discover how: shell summary: 5 tests selected Splitting plan to batches of 3 tests. report how: display summary: no results found finish warn: Nothing to finish, no guests provisioned.Shall we somehow skip the
reportandfinishstep for the original plans?
I think we should. But, I'm poking the area of "what to do when quitting" from so many angles that I absolutely need to clean the buffer of pending PRs waiting to be merged before I dare to change more lines. Rebase was already painful enough, and I'm not even speaking about my memory, struggling to keep myself in the picture :(
I think we should. But, I'm poking the area of "what to do when quitting" from so many angles that I absolutely need to clean the buffer of pending PRs waiting to be merged before I dare to change more lines. Rebase was already painful enough, and I'm not even speaking about my memory, struggling to keep myself in the picture :(
Understood. So this would be a material for a separate pull request.
This is really weird, I wouldn't expect tmt to create invalid plans, and it looks like there's some kind of leak of fields:
feeling-safeshouldn't be there. I suspectcopy.deepcopy()is causing this mess.
Yeah, this part is really weird. Sounds like we should better cover it here. Do you feel it the same? Or also address separately?
This is really weird, I wouldn't expect tmt to create invalid plans, and it looks like there's some kind of leak of fields:
feeling-safeshouldn't be there. I suspectcopy.deepcopy()is causing this mess.Yeah, this part is really weird. Sounds like we should better cover it here. Do you feel it the same? Or also address separately?
I cannot reproduce the issue:
$ tmt -vv run --max 3 discover plan -n '^/plans/features/core$'
/var/tmp/tmt/run-218
Found 1 plan.
/plans/features/core
summary: Verify core functionality
discover
how: fmf
directory: /home/happz/git/tmt
hash: 312b16c9
filter: tier: 0, 1 & tag:-provision-only
summary: 41 tests selected
/tests/backward-compatibility
/tests/core/adjust
/tests/core/context
/tests/core/debug
/tests/core/docs
/tests/core/dry
/tests/core/enabled
/tests/core/env
/tests/core/environment-file
/tests/core/error
/tests/core/escaping
/tests/core/feeling-safe
/tests/core/fmf-id
/tests/core/force
/tests/core/link
/tests/core/logging
/tests/core/ls
/tests/core/order
/tests/core/path
/tests/core/phases
/tests/core/plan-env-file
/tests/core/smoke
/tests/core/spaces
/tests/core/web-link
/tests/init/base
/tests/init/full
/tests/init/git
/tests/init/mini
/tests/init/nested
/tests/lint/all
/tests/lint/plan/explicit-root
/tests/lint/plan/implicit-root
/tests/lint/story
/tests/lint/test
/tests/plan/create
/tests/plan/show
/tests/story/create
/tests/story/show
/tests/test/create
/tests/test/show
/tests/unit/with-development-packages/basic
Splitting plan to batches of 3 tests.
/plans/features/core.1
summary: Verify core functionality
discover
status: done
summary: 3 tests selected
/tests/backward-compatibility
/tests/core/adjust
/tests/core/context
/plans/features/core.2
summary: Verify core functionality
discover
status: done
summary: 3 tests selected
/tests/core/debug
/tests/core/docs
/tests/core/dry
...
Plus, there is no copy.copy() :O Current and derived plan share Step instances... So, I'm out of ideas.
Nevermind, setting "feeling safe" does the trick. I can reproduce it now, looking into it.
Nevermind, setting "feeling safe" does the trick. I can reproduce it now, looking into it.
Step subcommands of tmt do accept --feeling-safe option:
@classmethod
def options(cls, how: Optional[str] = None) -> list[tmt.options.ClickOptionDecoratorType]:
"""
Prepare command line options for given method
"""
# Include common options supported across all plugins
return [
metadata.option
for _, _, _, _, metadata in (
container_field(cls.get_data_class(), key)
for key in container_keys(cls.get_data_class())
)
if metadata.option is not None
] + (
tmt.options.VERBOSITY_OPTIONS
+ tmt.options.FORCE_DRY_OPTIONS
+ tmt.options.AGAIN_OPTION
+ tmt.options.FEELING_SAFE_OPTION
)
I'm not sure it's the right behavior, I would say "no" and let it only for run, but the effect of this is that discover plus TMT_FEELING_SAFE sets this option of the discover phases. It's not related to the changes in this PR, and it seems like a bigger fish to fry. IIUIC, my patch just makes this "leak" visible, the option exists, it's extracted from the environment and saved in all phases, which I'd consider an undesirable behavior. To me, this spells "another patch".
I'm not sure it's the right behavior, I would say "no" and let it only for
run,
Hm, I would not expect this. I thought that --feeling-safe is only available in the main tmt command. Not even in tmt run. Just at the top level. Seems it was added in 908d4c4992f0524935570fd4da3f7fe3814f8f09. @martinhoyer, do you remember why it was added to the individual plugin options as well?
IIUIC, my patch just makes this "leak" visible, the option exists, it's extracted from the environment and saved in all phases, which I'd consider an undesirable behavior. To me, this spells "another patch".
Yeah, agreed. Let's handle that separately. And thanks for the investigation!
Seems it was added in 908d4c4. @martinhoyer, do you remember why it was added to the individual plugin options as well?
I think I just assumed that's how it's supposed to work after seeing dry/verbose/again there, sorry.
@happz, planning to cover any remaining items from the checklist? I guess a short release note would make sense?
@happz, planning to cover any remaining items from the checklist? I guess a short release note would make sense?
I didn't plan to, and I can happily live without a release note given that --max itself is it's pretty much a toy without any practical effect.
@happz It is not clear to me how this affects report step? Is there a single report or split report? IOW if I do reportportal report and configure it to report run per suite will I get N suites (where N is from --max N) or just a single one?
@happz It is not clear to me how this affects report step? Is there a single report or split report? IOW if I do reportportal report and configure it to report run per suite will I get N suites (where N is from --max N) or just a single one?
Each new plan gets its report step from their "parent", reportportal would run as many times. Yes, I think it's possible you will end up with several suites, as there is no aggregation.
I wouldn't worry about it for now, there is no advantage in splitting plans like this, it's rather a testing tool for when working on bigger fish, like plan of plans, importing multiple plans and so on. By some accident, it got its own release note, but practical usability is low right now.
Ah, so it does not do multiple provision+execution in parallel?
Ah, so it does not do multiple provision+execution in parallel?
Exactly, it does not. It splits plans, but tmt does not know how to run more plans at the same time, hence no advantage. As I said, announcing it as something user-worthy was not a good idea from this PoV :)
Heh, I see, thanks for clarification :)