TSC icon indicating copy to clipboard operation
TSC copied to clipboard

Proposal to reduce inactive collaborator duration

Open anonrig opened this issue 1 year ago • 4 comments

The current security incidents around Linux made me realize that we should look into Node.js organization from a different perspective.

The current inactive collaborator duration is 18 months (1.5 years). Collaborators have write access to the repository, access to collaborator private repository and can run CI at any time (and run any code on our infrastructure).

The requirement to keep collaborator status is:

  1. Approve a pull-request that landed to main branch
  2. Author and land a pull-request to main branch at least once in 18 months

I'd like to reduce the inactive collaborator duration to 6 months or 12 months.

My reasonings are:

  • People who have not contributed or reviewed or landed a pull-request for more than 6 months has a high chance to not do it for the remainder of 18 months. Having them as "dormant" creates a security risk.
  • Due to the complexity of Node.js and the quantity of commits we land every year (around 2300~ commits in 2023 if I'm not mistaken), it's doesn't seem possible for a collaborator to review a pull-request even though they have not contributed for a really long time, and the validity of that review (or block) might or might not be correct.
  • Even though we have Slack channels that we can track/monitor of edge cases like force-pushes, landing pull-requests without changes, it's possible that a security incident might occur, in an area we didn't think of. From a security perspective each member of Node.js organization is a risk similar to what's happening right now in Linux ecosystem. This might be harsh to hear, but in reality we all know in our heart that having access to CI (through request-ci or custom tasks) pose serious security risk.
  • Throughout my Node.js contributing experience (referencing the past 2-3 years), I've seen contributors who never contributed block certain pull-requests that frustrate Node.js contributors, and even resulted in several resignations.

My logic might be flawed, and 6 months or 12 months might not be the correct duration, but I think we have the obligation (to the users of Node.js) to think about collaborator membership and security of it soon.

I believe that we should at least discuss/think about this in the short term.

cc @nodejs/tsc

anonrig avatar Mar 30 '24 19:03 anonrig

Back in 2018 at one collaboration summit I proposed to revoke the commit bit if it has not been used to commit anything in 3 months: https://github.com/joyeecheung/talks/blob/master/node_collab_summit_201805/core_collaboartors_status_and_scope.pdf (when I investigated in 2018, ~50% of the collaborators would be affected if we did that). Collaborators could request to go back if they started sending PR again.

Didn't follow up because there were some pushbacks.

joyeecheung avatar Mar 30 '24 21:03 joyeecheung

Collaborators have write access to the repository, access to collaborator private repository and can run CI at any time (and run any code on our infrastructure).

As a counter point I don't think this was ever abused. Releasing and security issues which are the riskier bits both require different steps from being a collaborator (e.g. release team, tsc) that have shorter inactivity deadlines and removing people doesn't buy us too much as a result.

Before we do this, I want to make sure it doesn't alienate people with periods of inactivity - can we run a query to check what active collaborators had a period of inactivity and returned from such a period? (Not sure how hard it is to check since a bunch of people mostly review)

benjamingr avatar Mar 31 '24 17:03 benjamingr

+1 on changing this. maybe we can make re-nomination easier. for example, requiring more than a single collaborator to block it

based on https://github.com/nodejs/node/blob/main/tools/find-inactive-collaborators.mjs if we change to 12 months of inactivity:

Inactive collaborators (10):

* Anto Aravinth
* Ash Cripps
* Bradley Farias
* Daniel Bevenius
* Gus Caplan
* Ian Sutherland
* Brian White
* Ricky Zhou
* Ujjwal Sharma
* Masashi Hirano

if we change to months of inactivity:

Inactive collaborators (19):

* Anto Aravinth
* Ash Cripps
* Bradley Farias
* Сковорода Никита Андреевич
* Daniel Bevenius
* Gus Caplan
* Feng Yu
* Ian Sutherland
* Yoshiki Kurihara
* Mestery
* Alba Mendez
* Brian White
* Ricky Zhou
* Ujjwal Sharma
* Masashi Hirano
* Tiancheng "Timothy" Gu
* Vladimir de Turckheim
* Khaidi Chu
* Yash Ladha

Note Ian Sutherland has already passed the 18 months inactivity period: https://github.com/nodejs/node/pull/52300

MoLow avatar Apr 01 '24 05:04 MoLow

I don't think reducing the inactivity period will reduce the risk of a rogue collaborators.

We should document how we are doing this (code reviews, release reviews, minimum 2 weeks before a commit goes to a LTS release, a tight group of releasers).

mcollina avatar Apr 01 '24 07:04 mcollina

I don't think reducing the inactivity period will reduce the risk of a rogue collaborators.

Strongly Agree

Collaborators have write access to the repository, access to collaborator private repository and can run CI at any time (and run any code on our infrastructure).

While write access grants certain privileges, there are other control measures in place that limit the capabilities for hostile operations.

Firstly, there are two CI systems (https://ci.nodejs.org/ and https://ci-release.nodejs.org/) with different Project-based Matrix Authorization Strategies.

In the ci, collaborators are limited to the following:

image

Reference

While in the ci-release, collaborators are not allowed at all, as it is limited to Jenkins administrators (sub-group) and releasers.

On the other hand, even if a malicious collaborator is capable of landing commits abusing write access, there are certain ways to detect it, like the OSSF Scorecard already in place that the Security WG reviews every two weeks (example). But I strongly believe that before that, any TC or collaborator will detect that "unexpected change" in the main branch or some alerts in Slack will bring our attention.

Moreover, even if this change goes unnoticed, the commit won't be released automatically to the Node.js users. As with any release in Node.js, we are required to follow these 20 steps. As part of these steps, the releasers will first cherry-pick the commits based on certain criteria and then elaborate a PR that anyone can review (see). These PRs tend to be open for days and are quite well-reviewed and commented by the collaborators.

My two cents, as @mcollina said, the key here is to document and review these protections and processes that we have already in place to prevent bad things (accidental stuff, human errors, malicious actors...) 😊

UlisesGascon avatar Apr 08 '24 14:04 UlisesGascon

the benefit of this suggestion might not be huge, but if it is going to affect ±20 people there is not a big downside either

MoLow avatar Apr 08 '24 14:04 MoLow

One of the hopes of the commit queue was to perhaps reduce the need to grant so many people write access to the repository.

richardlau avatar Apr 08 '24 14:04 richardlau

@UlisesGascon thanks for providing the outline of some of the things that we do that reduce the risk of a rogue collaborator.

If we do make a change I think 6 months is too short but would be ok with 1 year for inactivity.

mhdawson avatar Apr 08 '24 15:04 mhdawson

If we do make a change I think 6 months is too short but would be ok with 1 year for inactivity.

@mhdawson Why do you think a year is too short? Can you elaborate?

anonrig avatar Apr 08 '24 15:04 anonrig

While write access grants certain privileges, there are other control measures in place that limit the capabilities for hostile operations.

@UlisesGascon Having write access (meaning executing/running CI jobs) gives users to run any arbitrary code on the proposed infrastructure. We also talked about non-collaborators being/not being on the release team for example due to similar concerns. So, I don't quite understand how write access can be preventable with other control measures.

anonrig avatar Apr 08 '24 15:04 anonrig

@anonrig I can easily see people taking a break (for example to care for a new child etc.) who will come back getting removed by the 6 month check. A 12 month check makes this a lot less likely in my opinion.

mhdawson avatar Apr 08 '24 16:04 mhdawson

If as @mcollina mentioned that this measure doesn't address the security issues mentioned in the motivation, what's the remainder motivation of the change?

legendecas avatar Apr 24 '24 15:04 legendecas

If as @mcollina mentioned that this measure doesn't address the security issues mentioned in the motivation, what's the remainder motivation of the change?

Reducing the number of collaborators as a mean to better communicate to the public the number of people actively maintaining the project, I guess?

benjamingr avatar Apr 24 '24 15:04 benjamingr

I went ahead a run an analysis on Node.js codebase. Using the 8232 commits (that's 20% of all the commits that have landed on main) that landed at least one year ago, counting contributions for commit authors, co-author, and reviewer, here are the results:

{
  "oneTimeContributors": 517,
  "sampleSize": 324,

  "25-percentile": "3 weeks",
  "median": "11 weeks",
  "75-percentile": "6 months",
  "85-percentile": "9.5 months",
  "90-percentile": "12.9 months",
  "99-percentile": "22.5 months",

  "max": "28.9 months"
}

N.B.: this includes all returning contributors, not just collaborators. The vast majority of contributors contribute only once to the project.

For 90% of contributors, 56 weeks (which is just a bit more than 12 months) is their longest period on inactivity before coming back to the project. For 75% of contributors, 26 weeks (which is about 6 months) is their longest period on inactivity before coming back to the project 50% of contributors who have at least contributions and are inactive for more than 11 weeks won't come back.

Sorry if the presentation of results is not ideal. Someone who knows R, please make a graph 🙈

Code
#!/usr/bin/env node

// Identify inactive collaborators. "Inactive" is not quite right, as the things
// this checks for are not the entirety of collaborator activities. Still, it is
// a pretty good proxy. Feel free to suggest or implement further metrics.

import cp from 'node:child_process';
import fs from 'node:fs';
import readline from 'node:readline';

const cacheIdentities = new Map();
const tableOfContributions = { __proto__: null };

async function resolveIdentity(identity) {
  const childProcess = cp.spawn('git', ['check-mailmap', identity], {
    cwd: new URL('..', import.meta.url),
    encoding: 'utf8',
    stdio: ['inherit', 'pipe', 'inherit'],
  });

  return (await childProcess.stdout.toArray()).join('').trim();
}

async function runGitCommand() {
  const childProcess = cp.spawn('git', ['log', '-8232', '--before=1 year ago', '--pretty=format:{"hash":"%H","date":%ct,"actors":["%an <%ae>","%(trailers:only,valueonly,key=Co-authored-by,separator="%x2C")","%(trailers:only,valueonly,key=Reviewed-by,separator="%x2C")"]}'], {
    cwd: new URL('..', import.meta.url),
    encoding: 'utf8',
    stdio: ['inherit', 'pipe', 'inherit'],
  });
  const lines = readline.createInterface({
    input: childProcess.stdout,
  });
  const errorHandler = new Promise(
    (_, reject) => childProcess.on('error', reject),
  );
  await Promise.race([errorHandler, Promise.resolve()]);
  // If no mapFn, return the value. If there is a mapFn, use it to make a Set to
  // return.
  const allPromises = [];
  for await (const line of lines) {
    await Promise.race([errorHandler, Promise.resolve()]);
    const { date, actors } = JSON.parse(line.replace(/ "([a-zA-Z]+)" /, ' \\u0022$1\\u0022 '));
    for (const actor of actors) {
      if (actor) {
        let actualIdentity = cacheIdentities.get(actor);
        if (actualIdentity == null) {
          actualIdentity = resolveIdentity(actor);
          cacheIdentities.set(actor, actualIdentity);
        }
        allPromises.push(actualIdentity.then((identity) => {
          if (identity) {
            (tableOfContributions[identity] ??= []).push(date);
          } else {
            console.error({ line });
          }
        }));
      }
    }
  }
  return Promise.race([errorHandler, Promise.all(allPromises)]);
}

await runGitCommand();

let oneTimeContributors = 0;
let sampleSize = 0;
const maxDelaysBetweenContributions = [];

const sortNumbers = (a, b) => a - b;

for (const contributor in tableOfContributions) {
  const contributions = tableOfContributions[contributor].sort(sortNumbers);
  if (contributions.length === 1) {
    oneTimeContributors++;
    delete tableOfContributions[contributor];
    continue;
  }
  sampleSize++;
  let max = Number.MIN_SAFE_INTEGER;
  for (let i = 1; i < contributions.length; i++) {
    const diff = contributions[i] - contributions[i - 1];
    if (diff > max) max = diff;
  }
  maxDelaysBetweenContributions.push(max);
}

maxDelaysBetweenContributions.sort(sortNumbers);

const percentile = (percentile) => Math.ceil((percentile / 100) * maxDelaysBetweenContributions.length) - 1;

const secondsToDays = (seconds) => seconds / 3600 / 24;

const formatTime = (seconds) => {
  const days = secondsToDays(seconds);

  if (days < 7) {
    return `${Math.round(days)} days`;
  }
  if (days < 150) {
    return `${Math.round(days / 7)} weeks`;
  }

  return `${Math.round(days / 36.525 * 12) / 10} months`;

};

console.log(JSON.stringify({
  oneTimeContributors,
  sampleSize,

  '25-percentile': formatTime(maxDelaysBetweenContributions[percentile(25)]),
  'median': formatTime(maxDelaysBetweenContributions[percentile(50)]),
  '75-percentile': formatTime(maxDelaysBetweenContributions[percentile(75)]),
  '85-percentile': formatTime(maxDelaysBetweenContributions[percentile(85)]),
  '90-percentile': formatTime(maxDelaysBetweenContributions[percentile(90)]),
  '99-percentile': formatTime(maxDelaysBetweenContributions[percentile(99)]),
  'max': formatTime(maxDelaysBetweenContributions.at(-1)),
}, null, 2));
Results if I run the analysis on the last 20 000 commits that have landed on main
{
  "oneTimeContributors": 1423,
  "sampleSize": 690,

  "25-percentile": "2 weeks",
  "median": "12 weeks",
  "75-percentile": "8.7 months",
  "85-percentile": "14.9 months",
  "90-percentile": "18.3 months",
  "99-percentile": "41.7 months",

  "max": "72.5 months"
}

aduh95 avatar Apr 24 '24 17:04 aduh95

For 75% of contributors, 26 weeks (which is about 6 months) is their longest period on inactivity before coming back to the project

This (25% of collaborators inactive for 26 weeks returning, and 10% after 56 weeks) aligns with my intuition and strengthens my concern this will alienate a significant amount of contributions/reviews in the project.

benjamingr avatar Apr 24 '24 18:04 benjamingr

This (25% of collaborators inactive for 26 weeks returning, and 10% after 56 weeks)

Note that this is contributors, not collaborators (which may or may not be a close enough proxy, not sure). Filtering collaborators would be bit harder, but certainly possible if someone is motivated to adapt the script.

aduh95 avatar Apr 24 '24 18:04 aduh95

I've re-run the analysis, with different parameters:

  • 90% of Node.js contributors have less than 9 contributions. I decided to look into those who have more than 9 contributions (I expect all 87 collaborators and 120 emeriti to be in that last 10%). I also generated the result if we take 68 as the contribution threshold (68 because only 5% on Node.js contributors have 68 or more contributions).
  • Sometimes the maximum delay is not that informative, because it is by definition an extreme case (e.g. in my case, I made my first contribution in 2018, and didn't come back until much later. That first period of inactivity is not relevant imo). Instead, I'm looking at the different percentile of delay between contributions per contributor.
  • I'm taking into account the last 20k commits that landed on main (half commits of the lifetime of the project).
{
  "contributionThreshold": 9,
  "skippedContributors": 1896,
  "sampleSize": 217
}
┌─────────────────┬─────────────────┬─────────────────┬─────────────────┬───────────────┐
│     (index)     │ 90th-pencentile │ 95th-pencentile │ 99th-pencentile │      max      │
├─────────────────┼─────────────────┼─────────────────┼─────────────────┼───────────────┤
│ 25th-pencentile │    '7 days'     │    '2 weeks'    │    '7 weeks'    │  '13 weeks'   │
│     median      │    '4 weeks'    │    '8 weeks'    │   '17 weeks'    │ '5.9 months'  │
│ 75th-pencentile │   '13 weeks'    │  '5.2 months'   │  '10.4 months'  │  '13 months'  │
│ 80th-pencentile │   '16 weeks'    │  '6.5 months'   │   '13 months'   │ '15.1 months' │
│ 85th-pencentile │  '5.1 months'   │  '10.1 months'  │  '15.2 months'  │ '16.7 months' │
│ 90th-pencentile │  '6.7 months'   │  '13.2 months'  │  '18.9 months'  │ '18.9 months' │
│ 95th-pencentile │   '12 months'   │  '21.4 months'  │  '23.7 months'  │ '23.7 months' │
│ 99th-pencentile │   '22 months'   │   '40 months'   │  '43.8 months'  │ '43.8 months' │
│       max       │  '43.8 months'  │  '45.1 months'  │  '58.5 months'  │ '58.5 months' │
└─────────────────┴─────────────────┴─────────────────┴─────────────────┴───────────────┘

{
  "contributionThreshold": 68,
  "skippedContributors": 2007,
  "sampleSize": 106
}
┌─────────────────┬─────────────────┬─────────────────┬─────────────────┬───────────────┐
│     (index)     │ 90th-pencentile │ 95th-pencentile │ 99th-pencentile │      max      │
├─────────────────┼─────────────────┼─────────────────┼─────────────────┼───────────────┤
│ 25th-pencentile │    '3 days'     │    '6 days'     │    '3 weeks'    │   '9 weeks'   │
│     median      │    '10 days'    │    '2 weeks'    │    '8 weeks'    │  '18 weeks'   │
│ 75th-pencentile │    '3 weeks'    │    '5 weeks'    │   '16 weeks'    │ '8.7 months'  │
│ 80th-pencentile │    '3 weeks'    │    '6 weeks'    │   '18 weeks'    │ '11.1 months' │
│ 85th-pencentile │    '4 weeks'    │    '8 weeks'    │  '5.2 months'   │ '14.5 months' │
│ 90th-pencentile │    '5 weeks'    │    '9 weeks'    │  '6.1 months'   │ '16.3 months' │
│ 95th-pencentile │    '5 weeks'    │   '12 weeks'    │  '14.5 months'  │  '18 months'  │
│ 99th-pencentile │    '8 weeks'    │   '15 weeks'    │  '18.1 months'  │ '18.3 months' │
│       max       │    '9 weeks'    │   '16 weeks'    │  '26.1 months'  │ '26.1 months' │
└─────────────────┴─────────────────┴─────────────────┴─────────────────┴───────────────┘

The results are more or less consistent with what I've found earlier, except for the 99th percentile but I think that can be explained simply by the fact that I looked at more commits. Well it took probably too long to came to the same conclusion, but at least I feel more confident those figures corresponds to the reality. Hopefully I haven't introduced any bias in the way I'm selecting the data.

Code
#!/usr/bin/env node

import cp from 'node:child_process';
import readline from 'node:readline';

const cacheIdentities = new Map();
const tableOfContributions = { __proto__: null };

async function resolveIdentity(identity) {
  const childProcess = cp.spawn('git', ['check-mailmap', identity], {
    cwd: new URL('..', import.meta.url),
    encoding: 'utf8',
    stdio: ['inherit', 'pipe', 'inherit'],
  });

  return (await childProcess.stdout.toArray()).join('').trim();
}

async function runGitCommand() {
  const childProcess = cp.spawn('git', ['log', '-20000', '--pretty=format:{"hash":"%H","date":%ct,"actors":["%an <%ae>","%(trailers:only,valueonly,key=Co-authored-by,separator="%x2C")","%(trailers:only,valueonly,key=Reviewed-by,separator="%x2C")"]}'], {
    cwd: new URL('..', import.meta.url),
    encoding: 'utf8',
    stdio: ['inherit', 'pipe', 'inherit'],
  });
  const lines = readline.createInterface({
    input: childProcess.stdout,
  });
  const errorHandler = new Promise(
    (_, reject) => childProcess.on('error', reject),
  );
  await Promise.race([errorHandler, Promise.resolve()]);
  // If no mapFn, return the value. If there is a mapFn, use it to make a Set to
  // return.
  const allPromises = [];
  for await (const line of lines) {
    await Promise.race([errorHandler, Promise.resolve()]);
    const { date, actors } = JSON.parse(line.replace(/ "([a-zA-Z]+)" /, ' \\u0022$1\\u0022 '));
    for (const actor of actors) {
      if (actor) {
        let actualIdentity = cacheIdentities.get(actor);
        if (actualIdentity == null) {
          actualIdentity = resolveIdentity(actor);
          cacheIdentities.set(actor, actualIdentity);
        }
        allPromises.push(actualIdentity.then((identity) => {
          if (identity) {
            (tableOfContributions[identity] ??= []).push(date);
          } else {
            console.error({ line });
          }
        }));
      }
    }
  }
  return Promise.race([errorHandler, Promise.all(allPromises)]);
}
const percentile = (array, percentile) => array[Math.ceil((percentile / 100) * array.length) - 1];

await runGitCommand();

let skippedContributors = 0;
let sampleSize = 0;
const maxDelaysBetweenContributions = {
  __proto__: null,
  90: [],
  95: [],
  99: [],
  100: [],
};

const contributionThreshold = 68;

const sortNumbers = (a, b) => a - b;

// Const countContrib = Object.values(tableOfContributions).map((c) => c.length).sort(sortNumbers);

// console.log(Object.fromEntries(Array.from({ length: 99 }, (_, i) => [i + 1, percentile(countContrib, i + 1)])));

for (const contributor in tableOfContributions) {
  const contributions = tableOfContributions[contributor].sort(sortNumbers);
  if (contributions.length < contributionThreshold) {
    skippedContributors++;
    delete tableOfContributions[contributor];
    continue;
  }
  sampleSize++;
  const delayBetweenContributions = Array(contributions.length - 1);
  for (let i = 0; i < contributions.length - 1; i++) {
    delayBetweenContributions[i] = contributions[i + 1] - contributions[i];
  }
  for (const p in maxDelaysBetweenContributions) {
    maxDelaysBetweenContributions[p].push(
      p === '100' ?
        delayBetweenContributions.at(-1) :
        percentile(delayBetweenContributions.sort(sortNumbers), p),
    );
  }
}

const secondsToDays = (seconds) => seconds / 3600 / 24;

const formatTime = (seconds) => {
  const days = secondsToDays(seconds);

  if (days < 2) {
    return `${Math.round(days * 240) / 10} hours`;
  }
  if (days < 14) {
    return `${Math.round(days)} days`;
  }
  if (days < 150) {
    return `${Math.round(days / 7)} weeks`;
  }

  return `${Math.round(days / 36.525 * 12) / 10} months`;

};

const percentiles = [25, 50, 75, 80, 85, 90, 95, 99, 100];
const table = { __proto__: null };
for (const p in maxDelaysBetweenContributions) {
  const _p = p === '100' ? 'max' : p === '50' ? 'median' : `${p}th-pencentile`;
  const delays = maxDelaysBetweenContributions[p].sort(sortNumbers);
  for (const x of percentiles) {
    const _x = x === 100 ? 'max' : x === 50 ? 'median' : `${x}th-pencentile`;
    table[_x] ??= { __proto__: null };
    table[_x][_p] = formatTime(x === '100' ? delays.at(-1) : percentile(delays, x));
  }
}

console.log(JSON.stringify({
  contributionThreshold,
  skippedContributors,
  sampleSize,
}, null, 2));
console.table(table);
Percentiles of number of contributions per contributors for the last 20k commits
{
  median: 1,
  '67': 1,
  '68': 2,
  '79': 2,
  '80': 3,
  '84': 3,
  '85': 4,
  '86': 5,
  '87': 5,
  '88': 6,
  '89': 7,
  '90': 9,
  '91': 12,
  '92': 16,
  '93': 27,
  '94': 42,
  '95': 68,
  '96': 110,
  '97': 172,
  '98': 389,
  '99': 879
}

aduh95 avatar Apr 25 '24 00:04 aduh95

I’ve re-run the analysis, with different parameters:

So based on this, had the inactive duration been 12 or 9 or 6 months over the past few years, how many currently active collaborators would have been moved to emeritus?

GeoffreyBooth avatar Apr 30 '24 19:04 GeoffreyBooth