Files changed in a PR are too many to be fully detected by dorny/paths-filter

Open hwhsu1231 opened this issue 1 year ago • 2 comments

Problem Description

Recently, I found that when using dorny/paths-filter@v3 (currently https://github.com/dorny/paths-filter/commit/de90cc6fb38fc0963ad72b210f1f284cd68cea36), a PR with too many changed files causes the action to miss some of those files during filtering, which leads to incorrect filter results.

What happened?

Over the past few months, I have been trying to create an automation project that localizes CMake documentation.

Workflow to create a PR

First, I wrote a workflow file named ci-sphinx-update.yml, which essentially executes the following steps in order:

  1. Generate or update .pot files by running the sphinx-build command with the gettext builder
  2. Generate or update .po files from the .pot files by running the msgcat or msgmerge command
  3. Create a PR from a feature branch to the master branch using peter-evans/create-pull-request@v6

Thus, if this workflow runs from scratch to generate the .pot/.po files, the resulting PR should include both .pot and .po files. The output log of peter-evans/create-pull-request@v6 confirms this. Below is an excerpt from that log, which shows that the PR had a total of 4182 files changed.

Log of 'peter-evans/create-pull-request@v6':
  [343874be-8420-4a7d-aea3-c36361be72f7 3a5b420f6] pot(3.1): Update pot from Sphinx
   Author: docs-l10n[bot] <157310748+docs-l10n[bot]@users.noreply.github.com>
   4182 files changed, 245931 insertions(+)
   create mode 100644 l10n/3.1/crowdin.yml
   create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_compile_options.po
   create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_command.po
   create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_target.po
   create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_definitions.po
   ...
   ...
   ...
   create mode 100644 l10n/3.1/pot/variable/PROJECT_VERSION_TWEAK.pot
   create mode 100644 l10n/3.1/pot/variable/UNIX.pot
   create mode 100644 l10n/3.1/pot/variable/WIN32.pot
   create mode 100644 l10n/3.1/pot/variable/WINCE.pot
   create mode 100644 l10n/3.1/pot/variable/WINDOWS_PHONE.pot
   create mode 100644 l10n/3.1/pot/variable/WINDOWS_STORE.pot
   create mode 100644 l10n/3.1/pot/variable/XCODE_VERSION.pot
   create mode 100644 l10n/3.1/version.json

Workflow to check status

Next, I also wrote a workflow file named ci-check-status.yml, which uses dorny/paths-filter@v3 to filter .pot files as follows:

- name: Check for *.pot files changed
  id: filter
  if: ${{ steps.evprt.outputs.VERSION != '' }}
  uses: dorny/paths-filter@v3
  with:
    filters: |
      pot:
        - 'l10n/${{ steps.evprt.outputs.VERSION }}/pot/**'

However, when ci-check-status.yml was triggered and attempted to filter the .pot files changed in the PR, it missed every .pot file, so the pot filter returned false. Below is an excerpt from the log, which shows that dorny/paths-filter@v3 detected only 3000 changed files.

Log of 'dorny/paths-filter@v3':
Run dorny/paths-filter@v3
  with:
    filters: pot:
    - 'l10n/3.1/pot/**'
  
    token: ***
    list-files: none
    initial-fetch-depth: 100
Fetching list of changed files for PR#446 from Github API
  Invoking listFiles(pull_number: 446, per_page: 100)


  Received 100 items
  [added] l10n/3.1/crowdin.yml
  [added] l10n/3.1/po/ja/LC_MESSAGES/command/add_compile_options.po
  [added] l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_command.po
  [added] l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_target.po
  ...
  ...
  ...
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_LINKER_FLAGS.po
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_LINKER_FLAGS_CONFIG.po
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_MODULE_PREFIX.po
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_MODULE_SUFFIX.po
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
Detected 3000 changed files
Results:
Filter pot = false
  Matching files: none
Changes output set to []
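
To illustrate why the filter can come back false even though .pot files did change, here is a minimal Python sketch. It assumes the API returns files in path order (as the log above suggests) and stops after 3000 entries. The file names and the po/pot split (3600 and 580) are made up for illustration; only the 4182 total and the 3000 cutoff come from the logs. The prefix check is a simplified stand-in for the real glob matching.

```python
# Hypothetical model: file names and per-directory counts are invented.
# Only the 4182 total and the 3000 cutoff are taken from the logs.
API_FILE_CAP = 3000


def build_changed_files(po_count: int, pot_count: int) -> list[str]:
    """Build a synthetic, path-sorted list of changed files."""
    files = ["l10n/3.1/crowdin.yml", "l10n/3.1/version.json"]
    files += [f"l10n/3.1/po/ja/LC_MESSAGES/file_{i:04d}.po" for i in range(po_count)]
    files += [f"l10n/3.1/pot/file_{i:04d}.pot" for i in range(pot_count)]
    return sorted(files)


def matches_pot_filter(path: str, version: str = "3.1") -> bool:
    """Simplified stand-in for the glob 'l10n/<version>/pot/**'."""
    return path.startswith(f"l10n/{version}/pot/")


all_files = build_changed_files(po_count=3600, pot_count=580)
visible = all_files[:API_FILE_CAP]  # what the action would see after truncation

pot_total = sum(matches_pot_filter(p) for p in all_files)
pot_visible = sum(matches_pot_filter(p) for p in visible)

print(f"changed files: {len(all_files)}, visible: {len(visible)}")
print(f".pot files in PR: {pot_total}, .pot files visible: {pot_visible}")
```

Because "po/" sorts before "pot/", the first 3000 paths in this model are crowdin.yml plus .po files only, so no .pot path is ever seen by the filter.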

Conclusion

From the output logs of the two workflows, I infer that because the PR contains too many changed files (4182 in total), dorny/paths-filter@v3 cannot load all of them (it detected at most 3000 changed files).
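
The numbers in the log are consistent with a hard limit of 3000 files on the API side. The sketch below works through the arithmetic in Python. It assumes that the action requests ceil(4182 / 100) pages and that the API returns nothing once 3000 files have been served; both are assumptions inferred from the log, not confirmed from the action's source. The function name page_sizes is hypothetical.

```python
import math

PER_PAGE = 100       # per_page value shown in the log
API_FILE_CAP = 3000  # assumed upper limit on files returned for one PR


def page_sizes(total_changed: int, per_page: int = PER_PAGE,
               cap: int = API_FILE_CAP) -> list[int]:
    """Items received per page when ceil(total / per_page) pages are requested."""
    pages = math.ceil(total_changed / per_page)
    remaining = min(total_changed, cap)
    sizes = []
    for _ in range(pages):
        received = min(per_page, remaining)
        sizes.append(received)
        remaining -= received
    return sizes


sizes = page_sizes(4182)
full_pages = sum(1 for n in sizes if n == PER_PAGE)
empty_pages = sum(1 for n in sizes if n == 0)

print(f"pages requested: {len(sizes)}")
print(f"full pages: {full_pages}, empty pages: {empty_pages}")
print(f"files detected: {sum(sizes)}")
```

Under these assumptions, 42 pages are requested, 30 of them are full and 12 are empty, which matches the 12 "Received 0 items" lines and the "Detected 3000 changed files" line in the log.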

My questions are as follows:

  1. Is my inference correct?
  2. If so, how can I solve this problem?
  3. If it cannot be solved, is it considered a bug?
  4. If it's indeed a bug, I hope it could be fixed as soon as possible.

hwhsu1231, Mar 07 '24 17:03

Might be related to: https://github.com/orgs/community/discussions/57830

hwhsu1231, Mar 08 '24 02:03

I ran into this as well -- I've not had time to dig into it very much yet, but I did observe that falling back to the git-based change detection sidesteps the issue.

To do that, simply set the token param to an empty string:

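      # Note: git-based detection needs the repository to be checked out first.
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          # An empty token makes the action fall back to git commands
          # instead of the GitHub API for detecting changed files.
          token: ''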

kelchm, Apr 07 '24 02:04