Disable man-db dpkg trigger

Open lengau opened this issue 1 year ago • 4 comments

Description

This isn't necessarily a bug - more a performance enhancement request, but there's no category for that.

When running apt install commands on Ubuntu hosted runners, the runners spend about 45 seconds just processing man-db triggers:

image

However, in very few cases are these runners ever going to actually use these man-db changes, so this is close to a minute of wasted time for each run.

Adding the following commands to the setup of each runner image would disable this trigger:

echo "set man-db/auto-update false" | debconf-communicate
dpkg-reconfigure man-db

Platforms affected

  • [ ] Azure DevOps
  • [X] GitHub Actions - Standard Runners
  • [X] GitHub Actions - Larger Runners

Runner images affected

  • [X] Ubuntu 20.04
  • [X] Ubuntu 22.04
  • [X] Ubuntu 24.04
  • [ ] macOS 12
  • [ ] macOS 13
  • [ ] macOS 13 Arm64
  • [ ] macOS 14
  • [ ] macOS 14 Arm64
  • [ ] macOS 15
  • [ ] macOS 15 Arm64
  • [ ] Windows Server 2019
  • [ ] Windows Server 2022

Image version and build link

Fri, 15 Nov 2024 17:03:46 GMT
Runner Image
Fri, 15 Nov 2024 17:03:46 GMT   Image: ubuntu-24.04
Fri, 15 Nov 2024 17:03:46 GMT   Version: 20241112.1.0
Fri, 15 Nov 2024 17:03:46 GMT   Included Software: https://github.com/actions/runner-images/blob/ubuntu24/20241112.1/images/ubuntu/Ubuntu2404-Readme.md
Fri, 15 Nov 2024 17:03:46 GMT   Image Release: https://github.com/actions/runner-images/releases/tag/ubuntu24%2F20241112.1

Relevant step: https://github.com/canonical/starflow/actions/runs/11860374887/job/33055486302?pr=23#step:2:88

Is it regression?

no

Expected behavior

man-db triggers don't get run on apt-get install

Actual behavior

man-db triggers do get run

Repro steps

Run the following workflow; note that it tends to time out during trigger processing.

name: Self-tests for scanners

on:
  push:
    branches:
      - main
  pull_request:

jobs:
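  man-db:
    # runs-on was missing from the original snippet; ubuntu-24.04 matches the image reported above
    runs-on: ubuntu-24.04
    steps: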
      - run: |
          sudo apt-get update
          timeout 30s sudo apt-get --yes install python3-build

lengau avatar Nov 15 '24 17:11 lengau

Hi @lengau, we're looking into this issue and will update on it ASAP. Thank you!

vidyasagarnimmagaddi avatar Nov 15 '24 17:11 vidyasagarnimmagaddi

Huh, so I can't reproduce this in a fresh repository (see: https://github.com/lengau/playground/actions/runs/11861145621/job/33057884286) on any Ubuntu version - it might vary depending on how much disk access is occurring in the background or something too?

lengau avatar Nov 15 '24 18:11 lengau

Preliminary investigation suggests it is okay to disable the man-db auto-update in the runner setup. We will update once we complete our analysis.

subir0071 avatar Nov 20 '24 18:11 subir0071

Huh, so I can't reproduce this in a fresh repository (see: https://github.com/lengau/playground/actions/runs/11861145621/job/33057884286) on any Ubuntu version - it might vary depending on how much disk access is occurring in the background or something too?

Yes, this is possible. The man-db trigger causes some amount of I/O, and the faster your storage, the less noticeable the slowdown.

FWIW, I was able to reproduce this issue today: on a GitHub Actions job (public repo), the man-db trigger ran for over 6 minutes!

From the logs:

2024-12-17T18:07:12.3693605Z Image: ubuntu-24.04
2024-12-17T18:07:12.3695069Z Version: 20241208.1.0
[..]
2024-12-17T18:07:29.1945880Z Processing triggers for man-db (2.12.0-4build2) ...
2024-12-17T18:13:55.4340099Z 
2024-12-17T18:13:55.4340814Z Running kernel seems to be up-to-date.
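As a rough check of the "over 6 minutes" figure, the timestamp on the man-db trigger line can be subtracted from the timestamp on the next log line. This is only a sketch in POSIX shell: `to_seconds` is a hypothetical helper, and it assumes both timestamps fall on the same UTC day.

```shell
#!/bin/sh
# to_seconds TIMESTAMP: seconds since midnight for a log timestamp such as
# 2024-12-17T18:07:29.1945880Z (fractional seconds are dropped).
# Hypothetical helper; assumes both timestamps are on the same UTC day.
to_seconds() (
  t=${1#*T}     # drop the date part      -> 18:07:29.1945880Z
  t=${t%%.*}    # drop the fraction       -> 18:07:29
  t=${t%Z}      # drop a bare trailing Z, if there was no fraction
  h=${t%%:*}
  rest=${t#*:}
  m=${rest%%:*}
  s=${rest#*:}
  # Strip one leading zero so 08 and 09 are not read as invalid octal.
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
)

start=$(to_seconds 2024-12-17T18:07:29.1945880Z)  # "Processing triggers for man-db"
end=$(to_seconds 2024-12-17T18:13:55.4340099Z)    # first log line after the trigger
elapsed=$((end - start))
echo "man-db trigger: ${elapsed}s ($((elapsed / 60))m $((elapsed % 60))s)"
```

For the two timestamps above, this reports 386 seconds, i.e. 6m 26s.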

gsauthof avatar Dec 17 '24 20:12 gsauthof

I had the same issue. This worked for me:

echo 'set man-db/auto-update false' | sudo debconf-communicate >/dev/null
sudo dpkg-reconfigure man-db

https://askubuntu.com/a/1476024

0xTadash1 avatar Feb 25 '25 09:02 0xTadash1

Hi @lengau, it's true that man pages aren't really needed on the runners, but they are a common Ubuntu feature that other customers may use. By implementing this change, we could potentially harm many projects.

Please use the workaround above before installing the required applications; it works.

If you have any other questions, feel free to reach out to us.

Alexey-Ayupov avatar Feb 27 '25 09:02 Alexey-Ayupov

By implementing this change, we could potentially harm many projects.

@Alexey-Ayupov How exactly? Is there any example of one of these many projects?

It seems that this is simply not true.

Totktonada avatar Feb 27 '25 09:02 Totktonada

Don't runners have their own images, though? man-db optimizations should only matter for interactive use.

mikebveil avatar Apr 01 '25 15:04 mikebveil

In my experience, I found that this:

# Skip installing package docs (makes the man-db trigger much faster)
# (I disabled `/doc` and `/info` too, just in case.)
sudo tee /etc/dpkg/dpkg.cfg.d/01_nodoc > /dev/null << 'EOF'
path-exclude /usr/share/doc/*
path-exclude /usr/share/man/*
path-exclude /usr/share/info/*
EOF

is slightly faster than this:

echo "set man-db/auto-update false" | debconf-communicate
dpkg-reconfigure man-db

In short, disabling the man-db trigger takes longer than letting the trigger no-op after it notices that no man pages were added.
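For anyone who wants to repeat this kind of comparison on their own runner, a small timing wrapper is one option. The sketch below is an assumption, not something from the runner images: `time_cmd` is a hypothetical helper, and it relies on `date +%s`, which is widely available but only has whole-second resolution.

```shell
#!/bin/sh
# time_cmd LABEL CMD [ARGS...]: run CMD and report its wall-clock time in
# whole seconds, passing through CMD's exit status.
# Hypothetical helper; relies on `date +%s` (GNU, BSD and BusyBox date).
time_cmd() {
  label=$1
  shift
  start=$(date +%s)
  if "$@"; then status=0; else status=$?; fi
  end=$(date +%s)
  echo "$label: $((end - start))s (exit $status)"
  return "$status"
}

# Example use in a job step (not executed here):
#   time_cmd "install" sudo apt-get --yes install python3-build
```

Because the resolution is one second, this is only useful for differences like the multi-second trigger delays reported in this thread. For sub-second differences, the shell's `time` keyword is a better fit.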

sebastiancarlos avatar Apr 16 '25 20:04 sebastiancarlos

By implementing this change, we could potentially harm many projects.

By not implementing this change, you are harming millions of projects by wasting maintainers' time... Making this one small change could reduce GitHub's entire runner cost by 1% or more. I imagine that's a saving of $100,000s or more.

Installing a single package in my project, by default, takes anywhere in the range of 12-100 seconds. With one of the above fixes in place, it takes 4-5 seconds.

Dreamsorcerer avatar May 27 '25 00:05 Dreamsorcerer

I imagine that's a saving of $100,000s or more.

Assuming projects have billing enabled, who is saving and who is losing profits? 😅

kernc avatar Aug 10 '25 15:08 kernc

The performance impact of the man-db auto-update trigger appears to be getting worse - it's come up again in:

  • https://github.com/actions/runner/issues/4030
  • https://github.com/actions/runner-images/issues/5770#issuecomment-3409530353

edmorley avatar Oct 20 '25 18:10 edmorley

I suggest everyone affected by this (which seems to be quite a few projects, judging by the ping-backs in the issue history) open a GitHub support ticket asking for this to be fixed in the runner images (I'm going to do so now -> https://support.github.com/ticket/personal/0/3851768): https://support.github.com/contact/bug-report

edmorley avatar Oct 20 '25 18:10 edmorley

We'll reconsider this decision. Disabling automatic triggers might make sense given the impact. I'll come back later with an updated solution.

erik-bershel avatar Oct 21 '25 10:10 erik-bershel

I personally want to thank Mr. Morley for his social engineering feat in getting this issue reopened. Your strategic threat to bypass the issues mechanism will be studied for ages to come in our quest to shape open-source projects to the rightful will of the masses.

sebastiancarlos avatar Oct 21 '25 20:10 sebastiancarlos

Fix is deployed. 🎉

erik-bershel avatar Nov 18 '25 14:11 erik-bershel

Funnily enough, I had this line in my workflow (run on ubuntu-latest) and it has started failing because the file no longer exists 😓

- run: sudo rm /var/lib/man-db/auto-update

It needs -f or, more likely, the step can now just be removed.
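The difference between the two forms can be seen without root by using a scratch path. In this sketch, the temporary file path is an assumption standing in for /var/lib/man-db/auto-update, which does not exist on the updated images.

```shell
#!/bin/sh
# Compare `rm` and `rm -f` on a file that does not exist.
# The scratch path stands in for /var/lib/man-db/auto-update (assumption).
tmpdir=$(mktemp -d)
flag="$tmpdir/auto-update"   # never created, so it is missing

# Plain rm reports an error and a non-zero status for a missing file.
if rm "$flag" 2>/dev/null; then plain_status=0; else plain_status=$?; fi

# rm -f ignores missing files and exits with status 0.
if rm -f "$flag"; then force_status=0; else force_status=$?; fi

echo "rm exit status: $plain_status, rm -f exit status: $force_status"
rmdir "$tmpdir"
```

In a workflow, that would mean changing the step to `sudo rm -f /var/lib/man-db/auto-update` if it has to stay for older images.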

thom-nic avatar Nov 19 '25 17:11 thom-nic