Track and document tags
I added the `-Full` tag, since this test can take a long time in enterprise tenants with more than 100k apps. Originally posted by @merill in https://github.com/maester365/maester/issues/912#issuecomment-2869411288
I say we should document some of the tags somewhere. And maybe think of this problem the other way around: run these tests by default and have people exclude the special tags we assign to tests that are known to fail for large tenants.
I completely agree that it would be helpful to track and standardize tags somehow/somewhere.
@svrooij, can you please update the title of this issue to reflect the 'Track tags' topic? 🙏🙂
Related questions:
- What is the difference between the `All` tag and the `Full` tag? Does `All` = "all normal tests" and `Full` = "All + the tests that people don't normally include, such as the CA What If"?
- Should all tests have the `Full` tag and most tests have the `All` tag?
- Should it be assumed (and validated by Pester tests, as sketched below) that:
  - All tests in the `Maester` folder have the `Maester` tag
  - All tests in the `CIS` folder have the `CIS` tag
  - ...and so on?
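On the validation question above: a convention check could itself be a small meta-test. A minimal sketch follows, assuming a `./tests/<Suite>` folder layout and a plain-text tag match; neither reflects the module's actual internals, and a real check would parse the AST rather than grep.

```powershell
# Sketch of a folder/tag convention check (Pester 5).
# Assumes ./tests/<Suite>/*.Tests.ps1; the text match is a crude stand-in
# for a proper AST-based inspection of the -Tag arguments.
Describe 'Folder/tag conventions' {
    $cases = foreach ($suite in Get-ChildItem -Path ./tests -Directory) {
        foreach ($file in Get-ChildItem -Path $suite.FullName -Filter *.Tests.ps1 -Recurse) {
            @{ File = $file.FullName; Name = $file.Name; Tag = $suite.Name }
        }
    }

    It '<Name> carries the <Tag> tag' -TestCases $cases {
        (Get-Content -Raw $File) |
            Should -Match ("-Tag\s[^\r\n]*" + [regex]::Escape($Tag))
    }
}
```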
@merill, @f-bader, @Cloud-Architekt
So here's the difference.
I added `Full` for the tests that take a long time to run. E.g. customers with 50k+ users will see runs of 5+ or even 10+ hours.
These tests require all service principals to be downloaded, which can number in the millions in larger tenants.
The `All` tag was for tests that didn't take much time, like the What If test, but whose APIs were in preview and not stable.
So there is a need to exclusively flag and run these types of tests.
Maybe we can come up with better names to flag these two types.
Oh, that helps immensely! How about these?
| Current Tag | New Tag |
|---|---|
| Full | NoLimit |
| All | Preview or PreviewAPI |
This might also make it easier to add a new parameter called `IgnoreTenantSizeLimits` (or something to that effect) that includes the `NoLimit` tag when the switch is used.
We could add a warning (to the console and Maester-Action logs) when either of these tags is included in the `Tag` parameter. A more detailed status message could also be provided after Maester discovers the number of objects.
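To make the idea concrete, here is a hedged sketch of what that switch could do. `IgnoreTenantSizeLimits`, `NoLimit`, and the wrapper function are hypothetical names from this thread, not actual module parameters.

```powershell
# Hypothetical wrapper showing the idea: the switch folds the NoLimit
# tag into the included tags and warns about the expected runtime.
function Invoke-MaesterWithLimits {
    param(
        [string[]] $Tag = @(),
        [switch] $IgnoreTenantSizeLimits
    )

    if ($IgnoreTenantSizeLimits) {
        Write-Warning 'Including NoLimit tests; runs can take hours on large tenants.'
        $Tag += 'NoLimit'
    }

    # Only pass -Tag when tags were actually requested.
    $params = @{}
    if ($Tag.Count -gt 0) { $params.Tag = $Tag }
    Invoke-Maester @params
}
```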
`NoLimit` doesn't convey the meaning. This is what Gemini proposed. I like `LongRunning`; that way the user is saying to include the long-running tests...
| Category | Suggested Name | Rationale |
|---|---|---|
| Focus on duration | LongRunning | This is the most direct and universally understood term in software development for this purpose. |
| Focus on thoroughness | Comprehensive | This name frames the long duration in a positive light. It implies the tests are slow because they are exhaustive and provide deep validation, which is exactly your scenario. |
| Focus on scale | LargeScale | This is a great choice because it describes why the tests are slow: they are designed to run against large-scale environments (like tenants with 50k+ users). It sets the context correctly. |
| Focus on time | TimeIntensive | A slightly more formal and descriptive alternative to LongRunning. It clearly communicates that a significant time investment is required to run the test. |
| Focus on separation | Extended | This name suggests that these tests are an "extension" of the standard, quicker test suite. It creates a clear mental separation between routine checks and deep-dive validations. |
| Focus on caution | Slow | While very blunt, it is undeniably clear. It sets a stark and immediate expectation. This can be useful if you want to strongly discourage running these tests in typical CI/CD pipelines. |
That's a great breakdown!
- `LongRunning` is great for this purpose. That gets my vote! I would also be OK with `TimeIntensive`.
- `LargeScale` describes the environment more than the test itself, so we can eliminate that option.
- `Comprehensive` may create a sense of ambiguity about the "completeness" of the rest of the tests.
- `Extended` is indeed a good option for optional tests that we haven't formally found a spot in the default set for, but still wanted to ship.
- `Slow` may also be ambiguous about whether the test "will be run slowly" for throttling purposes or "is slow to run" because it is intensive, hits a slower API endpoint, pulls a large data set, etc.
Updated Proposal
| Current Tag | New Tag | Usage |
|---|---|---|
| `Full` | `LongRunning` | Tests that take a long time to run and are not included by default. |
| `All` | `Preview` | Tests that are still in preview status or that use a preview API. (Is more distinction still needed?) |
| `All`, untagged | `Extended` | Tests that are not included in the default set due to scope, standards, or limited adoption. (I think I recall seeing a few tests using the `All` tag for these.) |
These changes would remove ambiguity and confusion around the current `All` and `Full` tags. It could also allow literal use of the `All` tag, although it would probably be most beneficial to encourage syntax like `-Tag LongRunning,Preview,Extended` to include those additional tests.
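For illustration, the opt-in flow under the proposed names could look like this (assuming the renamed tags land and `Invoke-Maester` keeps its `-Tag` parameter):

```powershell
# Default run: the quick, stable test set.
Invoke-Maester

# Opt in to the slower and preview sets under the proposed tag names.
Invoke-Maester -Tag LongRunning, Preview, Extended
```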
I'm not even sure which tag is used by default if no tags are specified?
I feel more for a situation where Maester always runs all tests and then excludes the tests that fail for large tenants by setting the exclude list to `LongRunning` (or whatever the conclusion of the sample table above turns out to be).
I feel like this point actually gets to the heart of the question. By default, all tests are run except for those tagged with `All`, `Full`, and a couple of others. Whether ironic or confusing, it's not a straightforward user experience.
I believe the intent was to run everything except the tests that are long-running/time-intensive or still in preview; it's just that the tags used to denote those are not intuitive.
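To make that concrete, here is roughly what the default amounts to as I understand it. This is illustrative only; the exact default exclude list is the very thing this thread is trying to pin down.

```powershell
# Approximately what a default run does today (tag list illustrative):
Invoke-Maester -ExcludeTag All, Full

# The same filter expressed directly in Pester 5 terms:
$config = New-PesterConfiguration
$config.Run.Path = './tests'
$config.Filter.ExcludeTag = 'All', 'Full'
Invoke-Pester -Configuration $config
```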
As an aside, I feel that there is a similar gap in setting the expectation that Exchange Online and Teams will not be assessed unless the operator explicitly connects to those services (or uses -Service All) with the Connect-Maester command.
So with Maester I want the first-run experience to be really, really, really good.
The first time someone tries Maester, they should go from 0 to wow in < 5 minutes.
Being able to run and immediately see the value is what gets folks into the Maester ecosystem.
This leads to the happy path, and I believe it is one of the reasons why Maester usage increases each month and we get new users and orgs into the ecosystem.
After the first run, once they start digging in and notice that a lot of tests are skipped, we gently introduce them to the concept of installing the required modules and connecting. This is guided from the report UI, which links to the docs on how to connect with the EXO and Az modules.
Now imagine the other experience:
We require them to install all the modules (currently Exchange, Teams, Azure, and the list will grow). Then they need to sign in to each one, and they may or may not have permissions to those other areas.
Then when they finally run it takes anywhere from 20 minutes to 10 hours on the first run (depending on the size of the tenant).
There's a really good chance most folks will give up or get distracted and move on to the next thing.
I hope that explains why I'm very pedantic and have a strong opinion on making sure the first run by a new user has the path with the least friction and immediately shows the value of Maester.
Once they are over that bump and are sold and invested in Maester we can gently ramp them up into the advanced concepts and options.
So I had a chat with @Cloud-Architekt and @f-bader, and @Cloud-Architekt thought Comprehensive is the better flag to use since it is clearer about the intent: we are stepping up and running the full gamut of tests.
The LongRunning option felt like something folks would want to avoid and wasn't clear about the benefit.
Thank you for sharing the thought process behind this! What are your thoughts on this list of potential follow-up actions?
- Update web site documentation for creating tests
- Update test templates
- Create project milestone (?) to include a coordinated release that includes:
  - Update the test generation process for EIDSCA, ORCA, and CISA
  - Update existing tests
  - Update the readme and built-in documentation
With changes to the existing tag usage:
- `Full` to `Comprehensive`
- `All` to `Preview`
- Possibly document use cases for `Extended`
Thought: Changing existing tags might be the equivalent of a breaking change for users who have automated testing that references specific tags. Should we semantically version it as such and trigger a v2.0 release?
Preview releases of Maester vNext have new tags and a function called `Get-MtTestInventory` that returns a list of all tags and every test associated with each tag. These updates are described in the latest blog post, "What's New with Maester Tags."
@svrooij and @merill, what do you think about creating a GitHub workflow that runs `Get-MtTestInventory` and then builds a markdown table from the results? That could then be published to the documentation website.
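A minimal sketch of the table-building step, assuming `Get-MtTestInventory` returns entries with a tag name and the tests carrying it (the property names here are guesses, not the confirmed output shape):

```powershell
# Sketch: turn the tag inventory into a markdown table for the docs site.
# Assumes each entry exposes .Tag and .Tests; adjust to the real shape
# of Get-MtTestInventory's output.
$inventory = Get-MtTestInventory

$lines = @('| Tag | Tests |', '|---|---|')
foreach ($entry in $inventory | Sort-Object Tag) {
    $tests = ($entry.Tests | Sort-Object) -join '<br>'
    $lines += ('| `{0}` | {1} |' -f $entry.Tag, $tests)
}

Set-Content -Path ./docs/tags.md -Value $lines
```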
One additional consideration that we might want to cover is a historical glossary of tags. We might want to show tags that have been deprecated, and what date they were deprecated on; or show what date a new tag was added (discovered).
As an extension of that, I think there could be some benefits to publishing a prescribed set of tags for tests to choose from when they are included in the module. We could then use validation through JSON schema or Pester tests to ensure tag usage remains consistent and meaningful. (Custom tests would not necessarily have enforced tag taxonomy.)
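The validation half of that could be a straightforward Pester test against the prescribed list. A sketch, assuming the inventory function above and a hypothetical allowed-tags list we would maintain alongside the docs:

```powershell
# Sketch: enforce that in-module tests only use tags from a prescribed set.
# $allowedTags is a hypothetical list; .Tag is an assumed property name.
Describe 'Tag taxonomy' {
    BeforeAll {
        $allowedTags = @(
            'Maester', 'CIS', 'CISA', 'EIDSCA', 'ORCA',
            'Comprehensive', 'Preview', 'Extended'   # proposed names
        )
        $inventory = Get-MtTestInventory
    }

    It 'only uses tags from the prescribed set' {
        $unknown = @($inventory.Tag) | Where-Object { $_ -notin $allowedTags }
        $unknown | Should -BeNullOrEmpty
    }
}
```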
Love the idea. I also want to automate generating a page for every test; today we author this file manually, and we don't have a page for every test.
Doing this will help us rank better in SEO as well.
I think we can run Invoke-Maester and just generate the docs and markdown from it (excluding the pass/fail results).
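A rough sketch of that idea, assuming `Invoke-Maester -PassThru` returns the result object and that each test result carries name and description fields (the property names are assumptions, not a confirmed schema):

```powershell
# Sketch: run once, then emit a markdown stub per test from metadata only,
# ignoring pass/fail. Tests/Name/Description are assumed property names.
$results = Invoke-Maester -PassThru

foreach ($test in $results.Tests) {
    # Derive a filesystem-safe slug from the test name.
    $slug = $test.Name -replace '[^\w-]', '-'
    Set-Content -Path "./docs/tests/$slug.md" -Value @(
        "# $($test.Name)"
        ''
        $test.Description
    )
}
```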