Idea: Track bugs via tests to solve the problem of stale bug reports

Open Zirunis opened this issue 1 month ago • 2 comments

I know language proposals aren't welcome, but this isn't one. It's an idea for managing the hundreds of bug reports more efficiently, so I hope it's okay. If this is unhelpful just close it, I won't lose sleep over it :)

Motivation

I'm an obsessive lurker. I have read hundreds of Zig issues over the past months, mainly to learn more about the language and the open questions. Doing this I noticed a repeating pattern: bug reports with unclear status. Mostly this means the bug was originally filed with a reproduction, sometimes a short discussion followed (occasionally producing a simpler reproduction), and then the issue went stale. Nobody did anything wrong, but this still leads to two problems:

  1. It is unclear whether a bug still exists.
  2. The reproduction is outdated and requires rewriting for the newest Zig version.

What I have occasionally observed is other users putting in the work to upgrade the repro, run it, and report that it now works as expected, only for that work to lose its value too as time passes and Zig keeps changing. By the time someone gets around to closing the issue, the code has to be upgraded and tested yet again. Sometimes this also means a fix goes unnoticed and the bug later regresses.

While I don't have a good idea how to deal with the ~1,700 currently open bug reports other than taking the work head on and figuring out whether each one can still be reproduced before working on it, I do think there is a better way to deal with future ones. At the current rate I'd expect about 500 new bug reports next year that won't be solved within their release cycle (i.e. go somewhat stale), so I'm thinking any investment into dealing with them more efficiently will pay off quickly.

Idea

If each Bug was tracked by a test, then:

  1. The reproduction would be kept up-to-date with the rest of the repository.
  2. It could be detected automatically when the bug gets fixed unintentionally (as a side effect of differently motivated code changes) and the issue could be closed.
  3. No new test has to be written once it is fixed; the existing test can simply be moved to become a regular test.
  4. Unknown duplicates no longer cause unnecessary reproduction work, because all tests failing due to the same underlying error would get fixed at once and allow noticing and closing the previously unknown duplicates.
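Points 2 and 4 can be illustrated with a minimal sketch. It is written in Python for brevity and is not Zig's actual test harness; the `ReproResult` type, the function name, and the issue numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReproResult:
    issue: int    # issue number tracked by this reproduction (made-up values below)
    passed: bool  # True if the test now shows the expected behavior


def split_by_status(results):
    """Split tracked issues into (possibly fixed, still reproducing)."""
    fixed = sorted(r.issue for r in results if r.passed)
    still_open = sorted(r.issue for r in results if not r.passed)
    return fixed, still_open


# Assumption for illustration: issues 101 and 205 are unknowingly duplicates,
# so a single fix flips both of their reproduction tests at once.
before = [ReproResult(101, False), ReproResult(205, False), ReproResult(300, False)]
after = [ReproResult(101, True), ReproResult(205, True), ReproResult(300, False)]

print(split_by_status(before))  # ([], [101, 205, 300])
print(split_by_status(after))   # ([101, 205], [300])
```

A passing reproduction only suggests the bug is gone; someone would still need to confirm before closing the issue.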

Rough protocol

In my eyes Zig has two big advantages here:

  1. Zig users are largely competent, helpful, and willing to put in a bit of work to file good bug reports.
  2. Zig build's test infrastructure is quite powerful, yet relatively simple to write tests for.

I think it's possible to make use of these to solve the problem. What I'm envisioning is that bug reports get an improved template that guides a reporter through the process of writing a test that completely encompasses the bug. Most do this in some form already anyway; there is just no clearly defined format for reproductions yet. A team member then only needs to verify whether the behavior the user expects is actually the intended behavior. If not, discussion ensues. Up to this point this is essentially status quo. If it looks good, the team member moves the test to zig/test/reproductions (potentially renaming it slightly). Done. The issue is now tracked by a test.

In my head the additional work is low. For reporters this may actually help a little bit in that it is more clear what format the report and reproduction should have. Most reporters put work into isolating the bug and writing a small reproduction anyway, so I don't see this changing anything effort-wise. For maintainers the additional work is only committing the test case (and maybe slight changes to it) which I'm hoping can be done in the ballpark of 2 minutes.

Implementation

I'm hoping this section can be properly rewritten by smarter people if this is accepted. For now I'll just fill it with some thoughts:

  • The test harness needs to be extended to allow reproduction tests (residing in their own folder) to fail. Alternatively the problem could also be posed in reverse, asking reporters to create a test that has the reproduction and declares the unexpected output via the comments (see test case readme). This may be simpler for reporters since that would boil down to posting the code (as is status quo) and the unexpected output formatted into the test comment format. But it would be a little weird and maybe confusing to have bugs represented by passing tests and failing tests being a good thing. It would also have the problem that a failing test would not necessarily mean that the bug is fixed, it could simply be failing in a different way than before. But maybe bugs cannot generally be represented by intended behavior but sometimes only by definitely unintended behavior? Especially for code that should fail it may not be entirely clear what the expected output would be (maybe the error message doesn't exist yet?). Just leaving this here, maybe it's worth a second thought by people more familiar with the matter.
  • I'm unsure how exactly it works at the moment but I think most tests are run on all (primary) targets? Well, many bugs only exist for a specific platform so maybe they should only be run on that specific platform? Or maybe it's okay if they're run on all of them but it's only considered fixed by the test harness once it succeeds on all targets?
  • To minimize work for maintainers even more, it could be made a requirement to open a pull request with the test case before or immediately after opening the issue. The problem I see here is that that may make the hurdle to report a bug a little too high.
  • Unlike regular tests, reproduction tests should definitely contain the number of the issue they are tracking. It doesn't have to be in the title, but maybe as a comment. This could be done relatively easily by the reporter (although it would require an edit, since the issue number is unknown before posting), but it could probably also be done by a bot that scans a bug report for a reproduction with a placeholder comment such as //issue and replaces it with the correct number.
  • The test harness should report when reproduction tests suddenly succeed including the issue numbers. This then allows the developer to move the tests from test/reproductions to test/cases and add the issues to the commit message as closed. The issues could then be closed as usual when merging the PR.
  • Maybe a command could be added to the build script to move the tests automatically and emit the corresponding issue numbers (again) to make this even more straightforward.
  • Thinking of all currently existing bug reports, maybe a new issue label could be added to the repo that marks bugs that are tracked by a test. This way the well-meant work I mentioned in the introduction of rewriting reproductions and verifying them would no longer have to be wasted and could be done via a pull request, subsequently also giving the previously stale issue the "tracked-by-test" label.
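Several of these bullets can be sketched together: the `//issue` placeholder bot, the "fixed only once it succeeds on all targets" rule, and the move from test/reproductions to test/cases. This is a rough Python sketch, not an existing Zig tool. The `//issue #<number>` tag format, the function names, the target names, and the issue numbers are assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath

# A bare placeholder line, as suggested above, and the tagged form (assumed format).
BARE_PLACEHOLDER = re.compile(r"^//issue[ \t]*$", re.MULTILINE)
ISSUE_TAG = re.compile(r"^//issue #(\d+)$", re.MULTILINE)


def fill_placeholder(source: str, number: int) -> str:
    """Replace each bare `//issue` line with `//issue #<number>`."""
    return BARE_PLACEHOLDER.sub(f"//issue #{number}", source)


def tracked_issue(source: str):
    """Return the issue number recorded in a reproduction, or None."""
    match = ISSUE_TAG.search(source)
    return int(match.group(1)) if match else None


def fixed_issues(results: dict) -> list:
    """results maps issue -> {target: passed}; fixed means passing on every target."""
    return sorted(
        issue
        for issue, per_target in results.items()
        if per_target and all(per_target.values())
    )


def promoted_path(path: str) -> str:
    """Map test/reproductions/<name> to test/cases/<name>."""
    return str(PurePosixPath("test/cases") / PurePosixPath(path).name)


# Hypothetical reproduction file and issue numbers.
repro = '//issue\ntest "repro" {}\n'
tagged = fill_placeholder(repro, 1234)
print(tracked_issue(tagged))  # 1234

runs = {
    1234: {"x86_64-linux": True, "aarch64-macos": True},
    5678: {"x86_64-linux": True, "aarch64-macos": False},
}
print(fixed_issues(runs))  # [1234]
print(promoted_path("test/reproductions/repro_1234.zig"))  # test/cases/repro_1234.zig
```

Treating a bug as fixed only when every target passes is the stricter of the two options raised above; a per-target rule would need the reproduction to record which platform it applies to.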

Drawbacks

  • The advantage of reproductions staying up-to-date obviously comes at a cost. They have to be upgraded along with the rest of the repository when Zig changes. I think this work has to be done anyway, and this way it is just cheaper, but maybe I'm wrong?
  • Depending on how this is implemented it could lead to fewer Bug Reports. But maybe this is even a good thing if the quality goes up with it?
  • More tests means testing takes longer. I don't know how much of an effect this has, but again I think these tests should exist anyway, no? They could also be skipped by default and only run by the CI, or even just before a release to clean them up all at once.
  • The biggest limitation is probably that this may not work for all bugs. Some can maybe only be reproduced in a specific environment and it is currently (or generally?) not possible to recreate that within the test harness? It should still be possible for users to report those bugs.

Zirunis, Nov 15 '25 05:11

Would it be wise to provide a "test" tag, such that it can immediately be conveyed that a particular issue is equipped with a working test?

tn-lorenz, Nov 20 '25 08:11

I think this is an interesting idea and worth exploring. However, at the moment the Zig core team appears to be understaffed, with the number of untriaged issues and open PRs continuously rising. Under these circumstances PRs adding reproduction tests will go stale before they get merged, and now we're back to square one 🙂

linusg, Nov 20 '25 11:11