
Fix: Preserve manually added labels during workflow run and refine label sync logic

Open chiranjib-swain opened this issue 1 month ago • 1 comments

Description: This pull request updates the labeler function to improve how labels are managed and applied to pull requests, especially when labels may have been added manually or by other bots during the workflow run.

Label management improvements:

  • The label application logic now fetches the latest labels from the pull request before applying new ones, so it can detect labels that were manually added or added by other bots during the workflow run. It merges these with the labels determined by the config, deduplicates them, and enforces the maximum label limit (GITHUB_MAX_LABELS).
  • The output for all-labels now reflects the final set of labels actually applied to the pull request, so the output stays accurate and up to date.
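The merge step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the function name `mergeLabels` and its parameters are hypothetical, the value 100 for `GITHUB_MAX_LABELS` is taken from the 100-label limit mentioned later in this thread, and the choice to keep the labels already on the pull request first when truncating is an assumption.

```typescript
// Hypothetical sketch: names and truncation order are assumptions,
// not the actual implementation in the pull request.
const GITHUB_MAX_LABELS = 100;

function mergeLabels(
  latestOnPullRequest: string[],
  fromConfig: string[],
  maxLabels: number = GITHUB_MAX_LABELS
): string[] {
  // A Set drops duplicates and preserves first-seen insertion order,
  // so labels already on the pull request come before config labels.
  const merged = new Set<string>([...latestOnPullRequest, ...fromConfig]);

  // Enforce the cap by keeping only the first maxLabels entries.
  return Array.from(merged).slice(0, maxLabels);
}
```

Under these assumptions, the array returned here is both what would be sent to the API and what would be reported as the all-labels output, which keeps the two from drifting apart.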

Related issue: (https://github.com/actions/labeler/issues/908).

Check list:

  • [ ] Mark if documentation changes are required.
  • [ ] Mark if tests were added or updated to cover the changes.

chiranjib-swain · Nov 24 '25 05:11

Hi @jnewland, thank you for sharing your thoughts on the label management strategy.

We understand the familiarity with the earlier POST + DELETE approach. After revisiting the long-standing reliability challenges documented in PR #497, we’ve chosen to continue with the setLabels (PUT-based bulk update) method because it offers a more stable and predictable foundation for large or high-activity repositories.

Why we rely on setLabels (PUT)

The previous incremental approach worked for smaller workflows, but at scale it introduced several issues:

  1. High API usage: DELETE removes one label at a time, often triggering rate-limit exhaustion.
  2. Inconsistent retries: The non-atomic POST could fail with a 502 error after applying only part of a set of more than 50 labels. Subsequent retries then failed with a 422 error ("more than 100 labels") because the new labels combined with the partially applied ones exceeded the limit.
  3. Unreliable under load: With a high volume of PR labels (50+) or concurrent workflow runs, 5xx errors and incomplete updates became more frequent, because each sync was spread across multiple POST and DELETE requests.

These problems were significant enough that PR #497 replaced POST/DELETE with a single PUT-based update.

Why PUT is more reliable

  1. Safe to retry (idempotent)
  2. Updates all labels in one atomic call
  3. Enforces GitHub’s 100-label limit before sending the request
  4. Replaces dozens of DELETE calls with a single operation
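To make the retry and request-count points concrete, here is a small in-memory model. It is only a sketch under stated assumptions: `FakePullRequest`, `syncWithPut`, and `syncIncrementally` are hypothetical names, no real GitHub client is involved, and the model only mirrors the semantics described above (PUT replaces the whole set, POST adds, DELETE removes one label per request).

```typescript
// In-memory stand-in for a pull request's labels; makes no network calls.
class FakePullRequest {
  labels: string[];
  requests = 0;

  constructor(initial: string[]) {
    this.labels = initial.slice();
  }

  // Models PUT: replace the entire label set in one request.
  setLabels(next: string[]): void {
    this.requests += 1;
    this.labels = next.slice();
  }

  // Models POST: add labels that are not already present.
  addLabels(extra: string[]): void {
    this.requests += 1;
    for (const name of extra) {
      if (this.labels.indexOf(name) === -1) this.labels.push(name);
    }
  }

  // Models DELETE: remove exactly one label per request.
  removeLabel(name: string): void {
    this.requests += 1;
    this.labels = this.labels.filter((l) => l !== name);
  }
}

// One request, and repeating it leaves the same final state.
function syncWithPut(pr: FakePullRequest, desired: string[]): void {
  pr.setLabels(desired);
}

// One POST (if anything is missing) plus one DELETE per stale label.
function syncIncrementally(pr: FakePullRequest, desired: string[]): void {
  const toAdd = desired.filter((l) => pr.labels.indexOf(l) === -1);
  const toRemove = pr.labels.filter((l) => desired.indexOf(l) === -1);
  if (toAdd.length > 0) pr.addLabels(toAdd);
  for (const name of toRemove) pr.removeLabel(name);
}
```

In this model, starting from one wanted label plus three stale ones and syncing to two wanted labels costs one request with `syncWithPut` and four with `syncIncrementally`. Calling `syncWithPut` a second time, as a retry would, leaves the labels unchanged.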

This aligns better with the concurrency patterns common in GitHub Actions.

How PR #917 improves this further

PR #917 narrows the read–write race window, preserves manually added labels more consistently, and improves workflow stability when updates happen mid-run.
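As a rough illustration of why the timing of the read matters with a replace-all write, the sketch below compares reading labels at the start of a run with reading them immediately before the write. Everything here is hypothetical (`FakePr`, `union`, `runWithEarlyRead`, `runWithLateRead`), it runs entirely in memory, and it is not meant to describe how either version of the action is actually structured.

```typescript
// In-memory stand-in for a pull request; makes no network calls.
class FakePr {
  private labels: string[];

  constructor(initial: string[]) {
    this.labels = initial.slice();
  }

  list(): string[] {
    return this.labels.slice();
  }

  // Replace-all write, mirroring the PUT semantics discussed above.
  set(next: string[]): void {
    this.labels = next.slice();
  }
}

// Order-preserving, deduplicated union of two label lists.
function union(a: string[], b: string[]): string[] {
  return Array.from(new Set<string>([...a, ...b]));
}

// Snapshot taken early: anything added during midRun is overwritten.
function runWithEarlyRead(pr: FakePr, fromConfig: string[], midRun: () => void): string[] {
  const snapshot = pr.list();
  midRun(); // a person or another bot adds a label while the workflow runs
  pr.set(union(snapshot, fromConfig));
  return pr.list();
}

// Read immediately before the write: labels added during midRun survive.
function runWithLateRead(pr: FakePr, fromConfig: string[], midRun: () => void): string[] {
  midRun();
  const latest = pr.list();
  pr.set(union(latest, fromConfig));
  return pr.list();
}
```

The late read does not remove the race entirely, since a label could still be added between `list()` and `set()`, which is consistent with describing the window as narrowed rather than closed.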

We truly appreciate your feedback and are open to further discussion on improving label synchronization within the constraints of GitHub’s API.

chiranjib-swain · Dec 16 '25 05:12

Thanks for the update. I had forgotten that the label delete API only allows one label to be removed at a time; that's disappointing.

jnewland · Dec 17 '25 20:12