Extra index parameter added to Scenario Outline Example methods in Reqnroll 3.x
Reqnroll Version
3.2.1
Which test runner are you using?
NUnit
Test Runner Version Number
5.1.0.0
.NET Implementation
.NET 8.0
Test Execution Method
Azure DevOps Pipeline Task – PLEASE SPECIFY THE NAME OF THE TASK
Content of reqnroll.json configuration file
No response
Issue Description
After upgrading from Reqnroll 2.4.1 → 3.x, generated test methods for Scenario Outline Examples now include an additional index parameter. This changes the method signature and breaks our Azure DevOps test mapping, which relies on the previous parameter order.
This behaviour does not appear in the migration guide or release notes.
This method used to be generated as:
ValidateZeroInputForAgeLimit("Over","-10","The specified condition was not met for 'Age Limits'.", null)
Now it includes an index, which breaks the mapping to Azure DevOps:
ValidateZeroInputForAgeLimit("Over","-10","The specified condition was not met for 'Age Limits'.","1", null)
Steps to Reproduce
Create any scenario outline with multiple Examples and build. Upgrade to 3.2.1 and build. Compare the generated test names in Test Explorer.
Link to Repro Project
No response
That extra parameter is called the 'pickle index' and indicates which test (pickle) this row corresponds to. This is used by the Formatters system.
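To make the shape of the change concrete, the generated NUnit row test roughly changes like this (an illustrative sketch only, not the exact code Reqnroll emits; the parameter names are made up, and the two variants are shown side by side as overloads purely for comparison):

// Sketch of the generated row test, Reqnroll 2.x vs 3.x (simplified)
public class GeneratedFeatureSketch
{
    // 2.x shape: example values + example tags
    [NUnit.Framework.TestCase("Over", "-10", "The specified condition was not met for 'Age Limits'.", null)]
    public void ValidateZeroInputForAgeLimit(string condition, string value, string expectedError, string[] exampleTags)
    {
        // ... generated scenario body ...
    }

    // 3.x shape: an extra string argument carrying the pickle index ("1") just before the example tags
    [NUnit.Framework.TestCase("Over", "-10", "The specified condition was not met for 'Age Limits'.", "1", null)]
    public void ValidateZeroInputForAgeLimit(string condition, string value, string expectedError, string pickleIndex, string[] exampleTags)
    {
        // ... generated scenario body ...
    }
}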
Your recommendation is to manually update THOUSANDS of test mappings? I don't suppose there is a way to simply turn it off?
I'd love to understand this mapping in more detail.
The method signatures generated by the Reqnroll system are something of an implementation detail, but I appreciate that some non-Gherkin tools use them as a way to identify tests.
If your mapping looks at generated test names (not method signatures), we could try to find a way to let you generate custom names for parameterised tests that would match the "old" format you're tied to.
In Visual Studio 2022 the user right-clicks the test in Test Explorer and chooses "Associate to Test Case" to map it. Whatever string I see in Test Explorer gets written to the test case in Azure, and Azure uses this string to match them up (which seems ridiculous to me, but that's msft for you).
So for example this Reqnroll scenario:
Scenario Outline: Add two numbers
Given I have a calculator
And I have entered "<number1>" into the calculator
And I have entered "<number2>" into the calculator
When I press "<operation>"
Then the result should be "<expected>"
Examples:
| number1 | number2 | operation | expected |
| 5 | 3 | add | 8 |
| 10 | 2 | add | 12 |
will generate two test names in Test Explorer with version 2.4.1:
AddTwoNumbers("10","2","add","12",null)
AddTwoNumbers("5","3","add","8",null)
but in version 3.2.1 it produces these strings:
AddTwoNumbers("10","2","add","12","1",null)
AddTwoNumbers("5","3","add","8","0",null)
When the test runs in Azure, the pipeline looks up the work item with Associated Test Name AddTwoNumbers("10","2","add","12",null) and finds nothing, because the new name generated by the build is AddTwoNumbers("10","2","add","12","1",null).
This doc isn't the best, but it does explain something... https://learn.microsoft.com/en-us/azure/devops/test/associate-automated-test-with-test-case?view=azure-devops
I hope this helps. My only viable solution at this time is to write a PowerShell script that updates work items with the new values, but that seems a little bit risky to me. Doing it manually will take forever.
I understand the problem. We have also used this feature earlier, but I don't think Microsoft ever intended it to be used for hundreds of tests. Unfortunately, as an open-source project we cannot commit to never changing the signatures of the generated methods. We try to minimize these changes, though.
I suggest you stick to v2 or use the Azure DevOps API to update the associations as a batch. (Azure DevOps simply stores this data as normal fields that can be updated with the Work Item Update API.)
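For what it's worth, here is a minimal sketch of what such a batch update could look like, assuming a personal access token with work item write scope and assuming the association is stored in the standard Microsoft.VSTS.TCM.AutomatedTestName field of the test case work item; the organization, project, work item IDs and new names below are placeholders:

// Sketch: batch-update the automated test association of test case work items
// via the Azure DevOps Work Item Update (JSON Patch) REST API.
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class AssociationUpdater
{
    static async Task Main()
    {
        var organization = "my-org";   // placeholder
        var project = "MyProject";     // placeholder
        var pat = Environment.GetEnvironmentVariable("AZDO_PAT"); // PAT with work item write scope

        // work item id -> new automated test name (i.e. the v3 name including the pickle index)
        var updates = new Dictionary<int, string>
        {
            [12345] = "My.Namespace.CalculatorFeature.AddTwoNumbers(\"10\",\"2\",\"add\",\"12\",\"1\",null)"
        };

        using var client = new HttpClient();
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(
            "Basic", Convert.ToBase64String(Encoding.ASCII.GetBytes($":{pat}")));

        foreach (var (id, newName) in updates)
        {
            // JSON Patch document that overwrites the stored automated test name
            var patch = JsonSerializer.Serialize(new[]
            {
                new { op = "add", path = "/fields/Microsoft.VSTS.TCM.AutomatedTestName", value = newName }
            });

            var url = $"https://dev.azure.com/{organization}/{project}/_apis/wit/workitems/{id}?api-version=7.1";
            using var content = new StringContent(patch, Encoding.UTF8, "application/json-patch+json");
            var response = await client.PatchAsync(url, content);
            Console.WriteLine($"Work item {id}: {(int)response.StatusCode}");
        }
    }
}

The same PATCH request could equally be issued from the PowerShell script mentioned above; the only Reqnroll-specific part is computing the new display names.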
Your recommendation is to manually update THOUSANDS of test mappings?
..... My only viable solution at this time is to write a PowerShell script that updates work items with the new values, but that seems a little bit risky to me. Doing it manually will take forever.
I don't want to trivialize your problem, but you could use Copilot for it :)
This behaviour does not appear in the migration guide or release notes.
Maybe it's indeed a nice idea to add this to the migration guide / release notes.
This behaviour does not appear in the migration guide or release notes.
Maybe it's indeed a nice idea to add this to the migration guide / release notes.
Good point. I have now included it in https://github.com/reqnroll/Reqnroll/releases/tag/v3.0.0 (last point under breaking changes).
I could probably use Copilot to assist with rewriting the mappings on the work items using the DevOps API, but is that really good enough? These pickle index values are not guaranteed to remain static if the test changes. What's worse, the indexes do not restart per Scenario Outline but run across the whole feature file. For data-driven tests still in development, adding a new example or even a new Scenario Outline to a feature could throw off the indexes and break the mappings all over again.
So if we go with the workaround you propose I should
- update existing mappings
- never change the existing Examples or add examples ever
- never add a Scenario Outline to an existing feature file
- BEST PRACTICE for Azure DevOps: stop using Scenario Outline Examples in future tests and consider "locking" all existing feature files that use Scenario Outline Examples (after successfully re-mapping them)
Perhaps it is better to flatten the tests using the setting below? We would still need to script updating mappings but at least it would take the pickle problem away for good (if I'm understanding things?)
"generator": { "allowRowTests": false }
I have checked and setting generator/allowRowTests to false makes the names deterministic and compatible with v2.
These pickle index values are not guaranteed to remain static if the test changes. What's worse, the indexes do not restart per Scenario Outline but run across the whole feature file. For data-driven tests still in development, adding a new example or even a new Scenario Outline to a feature could throw off the indexes and break the mappings all over again.
That is a good point, but hard to fix. (The fix would be a breaking change again.) I don't know how many others are tied to the exact values (not even only the method names). If only a few, then the proposed workaround of setting allowRowTests to false is maybe the real solution to this problem. When we designed the concept of pickleIndex, we left it as "string" so that we can use a better identifier later, but we never considered that these should be fully permanent. Especially because the scenario names are also not permanent, so it would not be a great help to make them so. Even if we used a local example index, it would be quickly broken without being noticed if someone inserts a new example or changes their order.
I think if a process requires permanent names, then those could probably be better achieved by tagging the scenarios with an ID that you generate yourself (e.g. @id:3487697583) and making a generator plugin that wraps the generated methods with a method name derived from the identifier (e.g. test_3487697583). I don't know.
generator/allowRowTests looks like it uses the first column of an Examples table instead of the complete list of parameters, which is fine except for tests where the first column has duplicates; then it goes back to using an index (e.g., AddTwoNumbers_Variant0, AddTwoNumbers_Variant1, etc.), but the good news is that indexing in this case restarts per scenario.
But I still need to re-map my tests whichever choice I make. And in no case are the names guaranteed static.
I'm betting anyone who runs CI in Azure DevOps pipelines will have this issue. How many others is that?
I'm interested in exploring a "generator plugin" but worried I'll just need to maintain it.
No way to have a legacy mode that just ignores the pickle index? I guess I don't understand why it is needed...
I'm betting anyone who runs CI in Azure DevOps pipelines will have this issue. How many others is that?
I think the vast majority of users just run the tests with a dotnet test or VsTest task based on the assembly. This mapping is only needed when you want to run the tests using the VsTest TestPlan/TestSuite-based execution model, which has a couple of other problems as well, so I don't think many people use it for executing BDD scenarios.
Test names are also used to track the history of a test and if/when failures happened or started. No matter which "fix" we choose, our history is gone. And even if we were running tests like the "vast majority" (which I question), our history would not be maintained, would it?
Another remote possibility is to establish a custom .runsettings file. This has to be at a solution level, so there are limitations.
- Create an xml file at the solution level, named .runsettings
- Insert the following content (to be customized to your needs):
<?xml version="1.0" encoding="utf-8"?>
<RunSettings>
  <NUnit>
    <DefaultTestNamePattern>{m}({0},{1},{2})</DefaultTestNamePattern>
  </NUnit>
</RunSettings>
- Upgrade your test project dependency on Microsoft.NET.Test.Sdk to version 18.0.1 (I found this didn't work on earlier versions).
On a simple test project I was running, after a rebuild and prompting Test Explorer to Run All, the display of the test names updated to the format dictated by the .runsettings file.
See: NUnit documentation on custom test naming templating and MS documentation
A pattern such as {m}({0},{1},{2}) unfortunately has to be customized per Scenario Outline (or at least per distinct number of test method parameters). That implies you can effectively only cover Scenario Outlines with a single parameter count per solution, which admittedly is ugly. The pattern to use would need just the parameter placeholders (such as {1}) for the functional values and exclude the next-to-last parameter (which is the pickleIndex).
To sum up my comment in #867
We do a similar mapping and experienced behaviour like the new index break with our automatic mapping system, and especially for scenario outlines this still appears despite applying the claimed fix, which works only for normal scenarios. I could just update our tooling to ignore the last two parameters of a scenario instead of a single one, but one issue remains:
We don't know from the .trx/NUnit XML files whether a feature was generated with the new or old version of SpecFlow/Reqnroll, and therefore whether we need to ignore the last one or two arguments of the scenario method. As such, our Azure DevOps pipeline is also not able to report those tests correctly right now until there is a fix. In the short run we will need to downgrade.
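As an illustration of the kind of normalization such tooling could apply when reading a .trx file, here is a sketch (assumptions: UnitTestResult elements carry the usual testName/outcome attributes, argument values contain no commas, and the caller already knows whether the assembly was built with the old or new generator, i.e. whether one or two trailing arguments must be dropped):

// Sketch: read test names and outcomes from a .trx file and normalize parameterised
// names by dropping a known number of trailing arguments
// (1 for Reqnroll/SpecFlow 2.x assemblies, 2 for Reqnroll 3.x with the pickle index).
using System;
using System.Linq;
using System.Xml.Linq;

class TrxNameNormalizer
{
    static readonly XNamespace Ns = "http://microsoft.com/schemas/VisualStudio/TeamTest/2010";

    static string Normalize(string testName, int trailingArgsToDrop)
    {
        var open = testName.IndexOf('(');
        if (open < 0 || !testName.EndsWith(")")) return testName;   // not a parameterised test

        var method = testName.Substring(0, open);
        var args = testName.Substring(open + 1, testName.Length - open - 2)
                           .Split(',');                             // naive split: assumes no ',' inside values
        var kept = args.Take(Math.Max(0, args.Length - trailingArgsToDrop));
        return $"{method}({string.Join(",", kept)})";
    }

    static void Main(string[] cmdArgs)
    {
        var trxPath = cmdArgs[0];
        var drop = int.Parse(cmdArgs[1]);   // 1 = old format, 2 = new format with pickle index

        var doc = XDocument.Load(trxPath);
        foreach (var result in doc.Descendants(Ns + "UnitTestResult"))
        {
            var name = (string)result.Attribute("testName");
            var outcome = (string)result.Attribute("outcome");
            Console.WriteLine($"{Normalize(name, drop)} => {outcome}");
        }
    }
}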
Furthermore, the original issue of the thread creator also affects us heavily. All those links in TFS are gone; no ability for developers to retest bugfixes, as all links are broken. And not to mention: what if a test was originally failing with the old Reqnroll version and now needs to be executed again with the new version to verify that the bug got fixed? Do I need to have two links in the bug now?
Whatever you do to fix the issue: Just keep it as before (from what I saw it is not possible due to changed requirements by the VSTestAdapter??), and if it is not possible at all, at least allow identification of those crippled, changed files so that a script can deal with it. But the broken history is also going to affect us a lot.
The clrudolphi solution is impossible for us. We are talking about several thousands of tests, bugs, ... documents, and as such we can fully understand the interest of regnalebnela.
Whatever you do to fix the issue: Just keep it as before
I don't think that aggressive communication helps the situation. Please keep in mind that you are using an open-source project that is maintained by unpaid contributors. We do our best to help the community, but you are not in a position to give us orders.
Let me summarize the situation:
1. We have changed the signature of the methods generated from scenario outlines in v3 because the reporting infrastructure needed it. (Being able to identify the exact scenario outline example was a long-term issue anyway.) Whether the value we use for identification was the best choice is a fair question, but the need for the additional parameter is clear and we have not found any alternative solution for it.
2. We could make the additional parameter optional and disallow reporting when it is not enabled, but besides the complexity, we would introduce a complex dependency chain and continuous support/maintenance effort that we cannot take on, especially because there are satisfactory workarounds. This is something I am strongly against. Reporting is the new feature of v3; if someone doesn't want to use it, that is basically the v2 line.
3. The generated method names and parameter values are never going to be deterministic. You are loud about this change now, but the same problem happens if a) someone fixes a typo in the scenario name, b) adds a tag on the examples block, c) reorders the examples list, d) fixes a value in the examples table, e) changes the scenario name, f) changes the feature name, or g) moves the scenario to another feature file. If your process requires full traceability of historic executions, then you have to solve it yourself by providing unique IDs as tags or in some other way (see my previous comment). Reqnroll never intended to provide such functionality.
4. Reqnroll provides an option to make every scenario example identifiable as an individual test method by setting generator/allowRowTests to false. With that, the scenario outline examples get a unique method name that is only influenced by the first column of the examples table. Using the first column to specify unique keys lets you have method names that at least do not depend on the other values of the examples table, so they are more permanent. But still, changing the scenario or the feature name will break the continuity, so this still does not guarantee backwards traceability.
5. You need to understand that this change was not made by mistake; it is something we needed. I understand the frustration and the extra work it caused, but this is the nature of breaking changes. We did our best to keep these changes for major version changes only, following our semantic-versioning policy. Suggestions like "I don't care, just revert it" are not helpful. If it had been a simple mistake that we could simply revert, we would do that, for sure. This is a community and everyone has to contribute to it. You are allowed and welcome to make constructive suggestions, but sometimes this also means that you need to do a bit of extra work on your side; that's the price you pay for using an open-source product for free.
6. Based on what we have seen in this thread so far, I can see the following options:
A) Stick to v2. Feel free to port back any important change to the v2 line and send a PR to the line-v2 branch about it; we can make service releases for v2 if necessary.
B) Set generator/allowRowTests to false and use the unique methods generated for scenario outline examples.
C) Fork Reqnroll and make any changes you want (our open-source license allows this).
D) Suggest any concrete change (best as a PR) that allows using reporting without the extra parameter.
E) Suggest any concrete change (best as a PR) that provides more optimal pickle index values (this will not change the need for the additional parameter, but the values will be less dependent on other changes in the feature file).
#3 is misleading, because changing a method name and adding an example are not the same thing. When I make a change, it is a known thing and I can update accordingly. But if I make one change in a feature and the indexes all recalculate, the issue cascades to the rest of the tests in the feature, potentially breaking dozens of tests (or more), and it is not obvious which ones will break. It is not PREDICTABLE.
I don't say this just to argue with you. I say this to highlight the importance of changing generator/allowRowTests to false when upgrading to 3.x because the naming will be more predictable. Users should at least test this before going through the work of re-mapping stuff.
Thank you for all of this clarification @gasparnagy, and sorry for the inappropriate tone. I sometimes struggle to deliver the message of my long posts and wanted to add my two cents on what I assume are the only feasible options.
That you consider it a main feature of Reqnroll 3 had not reached me. I didn't realize that it was a major update from 2.4.0 to 3.0.3+, but assumed our last update from 3.0.3 to 3.11 broke the reporting, which would have been a sub-minor change. I definitely haven't read enough of the release notes and really missed that essential feature. Absolutely my fault.
We have known about the reporting issue for years and found our own way of reporting by additionally reading the scenario arguments and scenario names in a script, recreating the generated method names and matching them to the NUnit/MSTest test result files. That way we were able to tell: were all methods of the feature executed correctly, or did they fail or end with a StepPendingException?
You obviously put deeper thought into the reporting of SpecFlow itself, and I haven't seen that functionality so far, but we were more or less able to make it work without those pickle index values (which matches the interest of seeing the history of a test in Azure DevOps Server). So the question remains: where can we help to get this approach into Reqnroll?
And to show why I assume the argument of regnalebnala is so damn relevant, way more important than my own interest: this is the graph you currently get for every single parameterization in Azure DevOps Server:
You know exactly when a test started to fail and especially whether it fails regularly or not. By selecting an execution that is red, I can check why and what happened there; this is especially useful for unstable test implementations or changes that caused the error. It helps every developer to know if a failure ever happened before and when. All of this is identified by the test name, and you can even link every test with a bug. It is a great and important feature for keeping such tests in use. This track record would get lost whenever a feature gets edited, even by the smallest parameterization change. Right now it only changes when there is a change to this specific scenario parameterization, so we know: hey, the history ends here, so this test didn't exist before. There is usually a reason someone did that, and you can quite easily trace and identify it. So it's a pretty big deal for any kind of organization, I'd assume. So for companies, the all-arguments-in-the-method-name approach is either really necessary, or we need to find a way to match without it.
Do you have guidance on the files where this tracing is used for your reporting functionality, so that we can more easily check whether we have an alternative idea and can support the project thereby?
To keep some track of it: searching for the term "pickleindex", I found the method generating it, public void AddVariableForPickleIndex(CodeMemberMethod testMethod, bool pickleIdIncludedInParameters, int? pickleIndex), with some unit tests for it. But I haven't found the part that parses those values later. I only further saw: https://github.com/reqnroll/Reqnroll/blob/4c9ef2d3cea84361122891da706ce5062f0dc34b/Reqnroll/Formatters/ExecutionTracking/FeatureExecutionTracker.cs#L113 That tracks it accordingly. Is this the only place? I assumed you have some post-processing on those values. As it stands it looks like something an argument matcher could do (because those are the only difference between scenario examples).
Thanks @regnalebnala and @StarWars999123 for the comments. I agree that the fact that the parameters vary with changes to other scenarios in the file is not good. We will need to find a solution for this. But that will only optimize the pickleIndex values; we will still need the pickleIndex itself.
That tracks it accordingly. Is this the only place? I assumed you have some post-processing on those values. As it stands it looks like something an argument matcher could do (because those are the only difference between scenario examples).
Yes, I think this is the only place. We need to be able to match the current execution with the pickle ID that was generated for the execution at compile time. (In essence we would need to know which example was used to generate the current scenario outline execution.)
I don't know what you mean exactly by an argument matcher, but the problem is that you can have multiple example lines with exactly the same values; in that case you cannot tell just from the values which one is executing now.
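To spell the matching problem out with a toy example (this is not Reqnroll code, just a sketch of the reasoning): with two example rows that have identical values, matching an execution back to its row by argument values alone is ambiguous, while the index assigned at generation time is exact.

// Toy illustration: duplicated example rows cannot be distinguished by values alone
using System;
using System.Collections.Generic;
using System.Linq;

class PickleMatchingDemo
{
    record ExampleRow(int PickleIndex, string[] Values);

    static void Main()
    {
        var rows = new List<ExampleRow>
        {
            new(0, new[] { "A" }),
            new(1, new[] { "A" }),   // duplicate of row 0 by value
            new(2, new[] { "B" }),
        };

        // A test execution only reports its argument values...
        var executedValues = new[] { "A" };

        // ...so matching by values yields two candidates: ambiguous.
        var byValue = rows.Where(r => r.Values.SequenceEqual(executedValues)).ToList();
        Console.WriteLine($"Candidates by value: {byValue.Count}");        // prints 2

        // Matching by the index assigned at generation time is exact.
        var byIndex = rows.Single(r => r.PickleIndex == 1);
        Console.WriteLine($"Match by index: row {byIndex.PickleIndex}");   // prints row 1
    }
}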
@gasparnagy You mean that people really create a scenario outline and define the same parameters twice for the same scenario?
Scenario Outline: MyScenario
Given I press <Parameter>

Scenario Outline: OtherScenario
Given something else..
Examples:
| Parameter |
| A |
| B |
What's the idea of such example duplications as in MyScenario? We would have considered it a bad test design habit at the very least, but I would expect it to even be illegal in other test frameworks. Especially since I could only see a use for it if people rely on a certain execution order, which according to my experience cannot be guaranteed by the frameworks.
I assumed that one could generate methods out of it and track a "pickle" during execution, if you want to call it that, which is just: MyScenario("A"), MyScenario("B"), OtherScenario("A") as well as OtherScenario("B"). Such a pickle could stay internal (it wouldn't add any argument to the test case name and would not be visible in TFS) rather than being exposed in the function names. I would assume that's how it was done before. Was this short pickle also some kind of necessary performance improvement for the test tracing?
What's the idea of such example duplications as in MyScenario? We would have considered it a bad test design habit at the very least, but I would expect it to even be illegal in other test frameworks. Especially since I could only see a use for it if people rely on a certain execution order, which according to my experience cannot be guaranteed by the frameworks.
I assumed that one could generate methods out of it and track a "pickle" during execution, if you want to call it that, which is just: MyScenario("A"), MyScenario("B"), OtherScenario("A") as well as OtherScenario("B"). Such a pickle could stay internal (it wouldn't add any argument to the test case name and would not be visible in TFS) rather than being exposed in the function names. I would assume that's how it was done before. Was this short pickle also some kind of necessary performance improvement for the test tracing?
Well, you are right, this does not make much sense in this form. Some people just use scenario outlines to have multiple runs of the same test... :( But I have seen slightly more valuable cases, like:
Scenario Outline: My Scenario
Given I press <Parameter>
Then the button should be enabled
Examples: Core examples
| Parameter |
| A |
| B |
Examples: Positive examples
| Parameter |
| A |
| C |
Examples: Negative examples
| Parameter |
| B |
| D |
Of course we have the right to say that these are not allowed from now on, but we need to validate this carefully.
Background: the Gherkin specification and the expected Cucumber-like behavior are not decided by us directly, but by the Cucumber project. Terms like "pickles" come from Cucumber. Since we would like to emit Cucumber Messages test result files, we need to have a mapping to the model that Cucumber uses.
But in the meantime we had some discussion on Discord, and @clrudolphi had an idea to solve this that is prototyped as #938. Let's see what comes out of that. (His idea goes somewhat in the same direction as yours: make some compromise on the duplicated-example handling in favor of normal usage.)