
Track and document tags

Open svrooij opened this issue 6 months ago • 15 comments

I added the -Full tag, since this test can take a long time in enterprise tenants with 100k+ apps. Originally posted by @merill in https://github.com/maester365/maester/issues/912#issuecomment-2869411288

I say we should document some of the tags somewhere. And maybe think about this problem the other way around: run these tests by default and have people exclude a special tag that we assign to tests known to fail for large tenants.

svrooij · Jun 25 '25 15:06

I completely agree that it would be helpful to track and standardize tags somehow/somewhere.

SamErde · Jul 01 '25 12:07

@svrooij, can you please update the title of this issue to reflect the 'Track tags' topic? 🙏🙂

SamErde · Jul 08 '25 15:07

Related questions:

  • What is the difference between the All tag and the Full tag? Is it:
    • All = "all normal tests"
    • Full = "All + the tests that people don't normally include, such as the CA WhatIf"
  • Should all tests have the Full tag and most tests have the All tag?
  • Should it be assumed (and validated by Pester tests) that:
    • All tests in the Maester folder have the Maester tag
    • All tests in the CIS folder have the CIS tag
    • ...and so on?
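
The folder-to-tag assumption above could be validated with a meta-test along these lines. This is only a sketch: the `./tests/<framework>` folder layout, the framework list, and the `-Tag` declaration syntax are assumptions for illustration, not Maester's actual repository structure.

```powershell
# Hypothetical Pester v5 meta-test: every test file under a framework
# folder must declare that framework's tag somewhere in the file.
Describe 'Tag taxonomy' {
    It 'tags every test under tests/<framework> with <framework>' -ForEach @(
        @{ framework = 'Maester' }
        @{ framework = 'CIS' }
        @{ framework = 'CISA' }
    ) {
        Get-ChildItem "./tests/$framework" -Filter '*.Tests.ps1' -Recurse |
            ForEach-Object {
                # Crude check: a -Tag declaration mentioning the framework name.
                $_.FullName | Should -FileContentMatch "-Tag\b.*\b$framework\b"
            }
    }
}
```

The `-ForEach` hashtables are needed so `$framework` is available at run time as well as discovery time, which is a common Pester v5 gotcha.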

@merill, @f-bader, @Cloud-Architekt

SamErde · Jul 08 '25 16:07

So here's the difference.

I added Full for the tests that take a long time to run. E.g., customers with 50k+ users will see runs of 5 to 10+ hours.

These tests require all service principals to be downloaded, which can number in the millions in larger tenants.

The All tag was for tests that didn't take much time, like the what-if tests, but whose APIs were in preview and not stable.

So there is a need to exclusively flag and run these types of tests.

Maybe we can have better names to flag these two types.

merill · Jul 08 '25 18:07

> So here's the difference.
>
> I added Full for the tests that take a long time to run. E.g., customers with 50k+ users will see runs of 5 to 10+ hours.
>
> These tests require all service principals to be downloaded, which can number in the millions in larger tenants.
>
> The All tag was for tests that didn't take much time, like the what-if tests, but whose APIs were in preview and not stable.
>
> So there is a need to exclusively flag and run these types of tests.
>
> Maybe we can have better names to flag these two types.

Oh, that helps immensely! How about these?

| Current Tag | New Tag |
|-------------|---------|
| Full | NoLimit |
| All | Preview or PreviewAPI |

This might also make it easier to use a new parameter called IgnoreTenantSizeLimits (or something to that effect) and include the NoLimit tag when that parameter switch is used.

We could add a warning (to the console and Maester-Action logs) when either of these are included in the Tag parameter. A more detailed status message could also be provided after Maester discovers the number of objects.
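
A minimal sketch of how that switch could work, assuming the proposed names (`IgnoreTenantSizeLimits` and the `NoLimit` tag are proposals from this thread, not shipped Maester API):

```powershell
# Hypothetical parameter handling: the switch folds the NoLimit tag
# into the tag list and warns the operator about the time cost.
param(
    [string[]] $Tag = @(),
    [switch] $IgnoreTenantSizeLimits
)

if ($IgnoreTenantSizeLimits) {
    Write-Warning 'NoLimit tests included: runs may take hours on large tenants.'
    $Tag += 'NoLimit'
}
```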

SamErde · Jul 08 '25 19:07

NoLimit doesn't convey the meaning. This is what Gemini proposed. I like LongRunning; that way the user is explicitly saying to include the LongRunning tests...

| Category | Suggested Name | Rationale |
|----------|----------------|-----------|
| Focus on duration | LongRunning | The most direct and universally understood term in software development for this purpose. |
| Focus on thoroughness | Comprehensive | Frames the long duration in a positive light. It implies the tests are slow because they are exhaustive and provide deep validation, which is exactly this scenario. |
| Focus on scale | LargeScale | Describes why the tests are slow: they are designed to run against large-scale environments (like tenants with 50k+ users). It sets the context correctly. |
| Focus on time | TimeIntensive | A slightly more formal and descriptive alternative to LongRunning. Clearly communicates that a significant time investment is required to run the test. |
| Focus on separation | Extended | Suggests these tests are an "extension" of the standard, quicker test suite. Creates a clear mental separation between routine checks and deep-dive validations. |
| Focus on caution | Slow | Very blunt, but undeniably clear. Sets a stark and immediate expectation, which can be useful to strongly discourage running these tests in typical CI/CD pipelines. |

merill · Jul 14 '25 05:07

That's a great breakdown!

LongRunning is great for this purpose. That gets my vote! I would also be OK with TimeIntensive.

LargeScale describes the environment more than the test itself, so we can eliminate that option. Comprehensive may create a sense of ambiguity about the "completeness" of the rest of the tests. `Extended` is indeed a good option for optional tests that we haven't formally found a spot in the default set for, but still wanted to ship. `Slow` may also be ambiguous about whether the test "will be run slowly" for throttling purposes or "is slow to run" because it is intensive, hits a slower API endpoint, pulls a large data set, etc.

Updated Proposal

| Current Tag | New Tag | Usage |
|-------------|---------|-------|
| Full | LongRunning | Tests that take a long time to run and are not included by default. |
| All | Preview | Tests that are still in preview status or that use a preview API. (Is more distinction still needed?) |
| [All, null] | Extended | [I think I recall seeing a few tests using the All tag for] tests that are not included in the default set of tests due to scope, standards, or limited adoption. |

These changes would remove ambiguity and confusion around the current All and Full tags. It could also allow literal use of the All tag, although it would probably be most beneficial to encourage syntax like -Tag LongRunning,Preview,Extended to include those additional tests.
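
The opt-in syntax might then look like this (the tag names are the proposals above, not shipped defaults; `Invoke-Maester` does accept a `-Tag` parameter today):

```powershell
# Default run: the standard test set only.
Invoke-Maester

# Explicitly opt in to the additional sets using the proposed tag names.
Invoke-Maester -Tag LongRunning, Preview, Extended
```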

SamErde · Jul 14 '25 14:07

I'm not even sure which tag is used by default if no tags are specified?

I feel more for a situation where it always runs all tests and then excludes tests that fail for large tenants by setting Exclude to LongRunning (or whatever we conclude from the sample table above).
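
The opt-out model would look something like this (assuming the proposed LongRunning tag; `Invoke-Maester` does expose an `-ExcludeTag` parameter):

```powershell
# Run everything by default, then exclude the slow tests explicitly.
Invoke-Maester -ExcludeTag LongRunning
```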

svrooij · Jul 16 '25 19:07

> I'm not even sure which tag is used by default if no tags are specified?
>
> I feel more for a situation where it always runs all tests and then excludes tests that fail for large tenants by setting Exclude to LongRunning (or whatever we conclude from the sample table above).

I feel like this point actually gets to the heart of the question. By default, all tests are run except for those tagged with All, Full, and a couple of others. Whether ironic or confusing, it's not a straightforward user experience.

I believe the intent was to run all except for those that are long-running/time-intensive or still in preview; it's just that the tags used to denote those are not intuitive.

As an aside, I feel that there is a similar gap in setting the expectation that Exchange Online and Teams will not be assessed unless the operator explicitly connects to those services (or uses -Service All) with the Connect-Maester command.

SamErde · Jul 16 '25 20:07

So with Maester, I want the first-run experience to be really, really good.

The first time someone tries Maester, they should go from 0 to wow in < 5 minutes.

Being able to run and immediately see the value is what gets folks into the Maester ecosystem.

This leads to the happy path and I believe is one of the reasons why Maester usage increases each month and we get new users and orgs into the Maester ecosystem.

After the first run, once they start digging in and noticing that a lot of tests are skipped, we gently introduce them to the concept of installing the required modules and connecting. This is guided from the report UI, which links to the docs on how to connect to the Exo and Az modules.

Now imagine the other experience:

We would require them to install all the modules (currently Exchange, Teams, and Azure, and the list will grow). Then they need to sign into each one, and they may or may not have permissions to these other areas.

Then when they finally run it takes anywhere from 20 minutes to 10 hours on the first run (depending on the size of the tenant).

There's a really good chance most folks will give up or get distracted and move on to the next thing.

I hope that explains why I'm very pedantic and have strong opinions about making sure the first run by a new user has the least friction and immediately shows the value of Maester.

Once they are over that bump and are sold and invested in Maester we can gently ramp them up into the advanced concepts and options.

merill · Jul 17 '25 09:07

So I had a chat with @Cloud-Architekt and @f-bader, and @Cloud-Architekt thought Comprehensive is a better flag to use since it is clearer about the intent: we are stepping up to run the full gamut of tests.

The LongRunning option felt like something folks would want to avoid and wasn't clear about the benefit.

merill · Jul 18 '25 09:07

Thank you for sharing the thought process behind this! What are your thoughts on this list of potential follow-up actions?

  • Update website documentation for creating tests
  • Update test templates
  • Create project milestone (?) to include a coordinated release that includes:
    • Update test generation process for EIDSCA, ORCA, and CISA
    • Update existing tests
    • Update readme and built-in documentation

With changes to the existing tag usage:

  • Full to Comprehensive
  • All to Preview
  • Possibly document use cases for Extended

SamErde · Jul 18 '25 22:07

Thought: Changing existing tags might be the equivalent of a breaking change for users who have automated testing that references specific tags. Should we semantically version it as such and trigger a v2.0 release?

SamErde · Aug 12 '25 17:08

Preview releases of Maester vNext have new tags and a function called Get-MtTestInventory that returns a list of all tags and every test associated with each tag. These updates are described in the latest blog post, "What's New with Maester Tags."

@svrooij and @merill, what do you think about creating a GitHub workflow that runs Get-MtTestInventory and then builds a markdown table from the results? That could then be published to the documentation web site.
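
Such a workflow step could be sketched like this. The output shape of Get-MtTestInventory (objects with Tag and Name properties) is a guess based on this thread, not the vNext preview's actual contract, and the output path is illustrative:

```powershell
# Build a markdown table of tags and their associated tests for the docs site.
$inventory = Get-MtTestInventory

$markdown = @('| Tag | Tests |', '| --- | --- |')
$markdown += $inventory |
    Group-Object -Property Tag |
    Sort-Object -Property Name |
    ForEach-Object {
        # One row per tag, listing every test that carries it.
        $tests = ($_.Group.Name | Sort-Object) -join ', '
        "| $($_.Name) | $tests |"
    }

Set-Content -Path ./docs/tag-inventory.md -Value $markdown
```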

One additional consideration that we might want to cover is a historical glossary of tags. We might want to show tags that have been deprecated, and what date they were deprecated on; or show what date a new tag was added (discovered).

As an extension of that, I think there could be some benefits to publishing a prescribed set of tags for tests to choose from when they are included in the module. We could then use validation through JSON schema or Pester tests to ensure tag usage remains consistent and meaningful. (Custom tests would not necessarily have enforced tag taxonomy.)
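
A taxonomy gate along those lines might look like this in CI. The approved-tags file name and the inventory shape are illustrative assumptions:

```powershell
# Hypothetical CI check: fail the build when a built-in test uses a tag
# that is not in the prescribed list.
$approved = (Get-Content ./docs/approved-tags.json | ConvertFrom-Json).tags

$unknown = Get-MtTestInventory | Where-Object { $_.Tag -notin $approved }

if ($unknown) {
    $unknown | ForEach-Object {
        Write-Error "Unapproved tag '$($_.Tag)' on test '$($_.Name)'"
    }
    exit 1
}
```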

SamErde · Oct 23 '25 20:10

Love the idea. I also want to automate generating a page for every test; today we have manual authoring of this file, and we don't have a page for every test.

Doing this will also help us rank better in SEO.

I think we can run Invoke-Maester and just generate the docs and markdown (excluding the pass/fail results).

merill · Oct 25 '25 06:10