Draft Statement of Work - Test reliability lead

Open mhdawson opened this issue 1 year ago • 22 comments

The test flakiness lead will be expected to:

  • lead a test reliability strategic initiative, rallying and supporting contributors who work to reduce flaky tests. This might include running regular test team meetings, writing documentation, building tools, or whatever strategy works to achieve more than they could on their own
  • build tools and improve automation that allow the project to effectively manage flaky tests and reduce their impact on the CI
  • investigate and fix existing tests that are marked as flaky in the status files

Duration

  • 6 months

Success looks like

  • maintain good test coverage
  • reduced number of tests marked as flaky in the status files
  • Running node-test-commit on the main branch will pass more often (hopefully always).
  • critical mass of contributors/collaborators dedicating time to addressing flaky tests that persists beyond the strategic initiative
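
Two of the criteria above are countable, so here is a minimal sketch of how they might be tracked. It assumes a status file lists one test per line as `name: OUTCOME, OUTCOME` under bracketed section headers, with `#` starting a comment. The sample text, the test names, and the helpers `count_flaky` and `pass_rate` are all hypothetical and only illustrate the idea.

```python
# Hypothetical excerpt in the style of a test status file (assumed format).
SAMPLE_STATUS = """\
prefix parallel

[true] # applies to all platforms
test-example-a: PASS, FLAKY

[$system==win32]
test-example-b: PASS,FLAKY
test-example-c: SKIP
"""


def count_flaky(status_text: str) -> int:
    """Count entries whose outcome list includes FLAKY."""
    count = 0
    for raw in status_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        # Skip blanks, section headers, and lines without a "name: outcomes" pair.
        if not line or line.startswith("[") or ":" not in line:
            continue
        _, outcomes = line.split(":", 1)
        if "FLAKY" in {o.strip().upper() for o in outcomes.split(",")}:
            count += 1
    return count


def pass_rate(results: list[bool]) -> float:
    """Fraction of CI runs that passed; 0.0 when there is no history."""
    return sum(results) / len(results) if results else 0.0


flaky_entries = count_flaky(SAMPLE_STATUS)
recent_rate = pass_rate([True, False, False, True])
```

Comparing `flaky_entries` and `recent_rate` at the start and end of the six months would give a rough before/after picture, although neither number says anything about test coverage.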

mhdawson avatar Oct 02 '24 16:10 mhdawson

@nodejs/tsc as discussed in the TSC meeting today a first cut at what a statement of work for a test flakiness lead might look like.

mhdawson avatar Oct 02 '24 16:10 mhdawson

Adding to agenda so that we review/get feedback in a future meeting.

mhdawson avatar Oct 07 '24 19:10 mhdawson

I think we should explicitly add:

Investigate and fix existing tests being marked as flaky in the status files

Success looks like ...reduced number of tests marked as flaky in the status files

Otherwise this might just optimize towards marking all the tests as flaky and let them rot in the status files, which isn't ideal.

joyeecheung avatar Oct 07 '24 19:10 joyeecheung

@joyeecheung updated, thanks for the suggestion.

mhdawson avatar Oct 08 '24 15:10 mhdawson

I think we should have more measurable success criteria, such as:

  • a list of tests that should be fixed by the individual (mandatory list)
  • number of flaky tests fixed (they can decide)

I would structure the agreement as:

  • xxx amount at start
  • yyy amount at 50% completion
  • zzz amount at finish

Alternatively, if the person is a member of the TSC, I would put it at:

daily rate of XXX * number of days worked, capped at zzz.

mcollina avatar Oct 24 '24 14:10 mcollina

That would be assuming the list of tests doesn't change while this is happening, which is unlikely to be true. Other contributors can always alter tests as necessary, mark tests as flaky as they see fit, or add more tests while all this is happening. Someone needs to keep an eye on the status of the new tests; otherwise new flakes will still come up and we won't be much better off. What we care about is whether the overall situation improves, and the situation doesn't stay static even if the hired individual does nothing.

Also, identifying this list is non-trivial work, since it can be challenging to triage the failures and arrive at a correct list. The list wouldn't be very meaningful if it contained a lot of false positives, yet eliminating false positives can already be difficult enough.

If we want a quantitative measurement, then I think the rate of a passing node-test-commit CI on the main branch is already enough (which has been around 0% for some time).

joyeecheung avatar Oct 24 '24 15:10 joyeecheung

That would be assuming the list of tests doesn't change while this is happening, which is unlikely to be true.

I agree with that.

I also think we want somebody who will do more than just fix specific tests. Helping to improve how we manage and resolve flaky tests through automation and tools is just as important as fixing specific flaky tests.

mhdawson avatar Oct 28 '24 14:10 mhdawson

I've not been privy to this conversation so I might be missing the mark with this comment, but I think it will be more understandable and more professional-sounding if you call it a "test reliability lead" rather than a "test flakiness lead". In a formal/professional document, I wouldn't refer to "flakiness" but instead refer to "reliability" (or "unreliability").

Trott avatar Nov 12 '24 04:11 Trott

@Trott like that suggestion, incorporated.

mhdawson avatar Nov 12 '24 19:11 mhdawson

Sent an email today to the Foundation's executive director, with @mcollina as our CPC rep on CC, to ask what might be possible in terms of funding from the OpenJS Foundation.

mhdawson avatar Nov 18 '24 23:11 mhdawson

@mhdawson (or @mcollina) seeing how half a year has passed I'm guessing the answer is 'no' but did you guys ever receive a reply?

bnoordhuis avatar Apr 16 '25 10:04 bnoordhuis

The answer is that there is no money for this right now. We are trying to assess whether we can find the funding to make it happen, specifically by looking at the amount of money that we will receive from HeroDevs. Having said that, most of the funds are spent on keeping the lights on for the Jenkins CI.

mcollina avatar Apr 16 '25 12:04 mcollina

Okay, thanks. Those keeping-the-lights-on funds you mention, are they coming from the $2.9m mentioned in #1687 or from somewhere else?

bnoordhuis avatar Apr 17 '25 07:04 bnoordhuis

You have misread that 2.9m figure. It is revenue, not profit. I recommend that you jump to https://youtu.be/Yq2hEseP-Ck?si=eNGFA1AR9_RpkIei&t=470 as all the costs are explained there.

Note that the budget was also revised down, as we missed one of our grants and a member reduced their level. So, we will be dipping into the reserves this year.

mcollina avatar Apr 17 '25 08:04 mcollina

Can these expenses be adjusted to allow more money for Node.js? Spending such big chunks of the revenue on other things while node has no money feels off. The biggest part of the revenue should be spent on the development of Node.js, like this work.

RaisinTen avatar May 04 '25 12:05 RaisinTen

That's revenue, not profit. That'd be fantastic if sustainable, but it's not. Almost all that projected revenue is tied into:

  1. events
  2. grants with specific focus (Node.js got 150k in 2025)

The Foundation is also doing its best to help with build/infra (specifically Macs).

mcollina avatar May 04 '25 13:05 mcollina

... Can these expenses be adjusted to allow more money for Node.js?

We have no oversight or say in what funds the Foundation may or may not make available to the project. We can ask if they are able to help with something, and if they say they can't, then they can't. Any feedback to the Foundation on that is correctly directed through the CPC.

... the biggest part of the revenue should be spent on the development of Node.js

Node.js is not the Foundation's only project. That said, we already get the majority of focus and an overwhelming majority of the available resources. The foundation is doing what it can. What would be most helpful in this conversation, for those of you wishing there was more money available for Node.js to use to fund various efforts, would be to encourage new member organizations to join the foundation (bringing in new member revenue), existing members to increase their investment (e.g. silvers change to gold), or by helping to promote services such as the extended support program.

jasnell avatar Jun 24 '25 03:06 jasnell

What would be most helpful in this conversation, for those of you wishing there was more money available for Node.js to use to fund various efforts, would be to encourage new member organizations to join the foundation (bringing in new member revenue), existing members to increase their investment (e.g. silvers change to gold), or by helping to promote services such as the extended support program.

I think this request is targeting a very ineffective audience: the same group of people who are capable of doing that work if they have the time. For example, if it takes more than 2 hours per week to reach out to companies to send money to the foundation in the hope that a fraction of it may be spent on fixing CI flakes in maybe a year (which is not guaranteed), it would be way more effective for us to just spend those 2 hours per week actually working on CI flakes ourselves (which is immediate and guaranteed work). The efficiency of this model is negatively correlated with how capable the audience is of doing that work themselves.

(I have a vague impression that the collective hours in the back and forth of trying to get money to work on this so far already exceeded what would be needed to bring the CI green rate up significantly if we just used those hours to fix CI flakes ourselves instead.)

joyeecheung avatar Jun 24 '25 15:06 joyeecheung

There was already an attempt to document a way to allow for more funds for Node.js in https://github.com/nodejs/admin/pull/955 but the Foundation said no with unclear reasoning, so it's been blocked with no observable progress for nearly 4 months.

RaisinTen avatar Jun 25 '25 04:06 RaisinTen

(I have a vague impression that the collective hours in the back and forth of trying to get money to work on this so far already exceeded what would be needed to bring the CI green rate up significantly if we just used those hours to fix CI flakes ourselves instead.)

Then let's just close this and not worry about asking the Foundation to pay for anything here. I see no reason to keep this issue open.

jasnell avatar Jun 25 '25 05:06 jasnell

Reopening this per the TSC discussion on 13th Aug to revisit the draft.

gireeshpunathil avatar Aug 14 '25 01:08 gireeshpunathil

Swapping the TSC agenda label of https://github.com/nodejs/TSC/issues/1614 with this one because this one is more actionable

joyeecheung avatar Oct 22 '25 10:10 joyeecheung