playwright [Bug]: Unable to extract large report.zip

Version

1.40.0

Steps to reproduce

I have a large report zip file (550 MB) after a CI run with a lot of failures.

Running

npx playwright merge-reports --reporter html ./all-blob-reports

failed with the following error:

Error: invalid local file header signature: 0x1b8e4947

Expected behavior

Should extract the large zip file

Actual behavior

Extraction of the zip file failed

Additional context

No response

Environment

System:
    OS: macOS 13.2.1
    CPU: (10) arm64 Apple M1 Max
    Memory: 2.48 GB / 32.00 GB
  Binaries:
    Node: 20.11.0 - ~/.nvm/versions/node/v20.11.0/bin/node
    Yarn: 1.22.21 - ~/.nvm/versions/node/v20.11.0/bin/yarn
    npm: 10.2.4 - ~/.nvm/versions/node/v20.11.0/bin/npm
  IDEs:
    VSCode: 1.86.0 - /usr/local/bin/code
  Languages:
    Bash: 3.2.57 - /bin/bash
  npmPackages:
    @playwright/test: ^1.40.0 => 1.40.1

Feb 12 '24 09:02 rkhomi

This looks unexpected! Is this reproducible all the time? Would it be possible to share a reproduction with us?

Do you transfer the zip files between different systems?

Feb 12 '24 20:02 mxschmitt

e2e_playwright:
    strategy:
      fail-fast: false
      matrix:
        shardIndex: [1, 2, 3, 4]
        shardTotal: [4]
        
       run: 
        docker compose exec -T playwright /devops/sh/pipeline/e2e_playwright.sh ${{ matrix.shardIndex }}/${{ matrix.shardTotal }} ${{ matrix.shardIndex }}

In e2e_playwright.sh:

yarn playwright test --project=mobile-"$SHARD_INDEX" --project=desktop-"$SHARD_INDEX"

In playwright.config.ts:

reporter: process.env.APP_ENV === 'pipeline' ? 'blob' : 'html',

It happened once in CI. As far as I understand from my testing:

I have 4 parallel GitHub jobs (matrix) and I was trying to run 2 projects in each job. What I saw is that, after a run, it deletes or overwrites the zip file generated by the first run, so the file might have been corrupted during this process.

I have now changed my script to see how it goes:

export PWTEST_BLOB_DO_NOT_REMOVE=1
PWTEST_BLOB_REPORT_NAME=job-"$SHARD_INDEX" yarn playwright test --project=mobile-"$SHARD_INDEX"  --project=desktop-"$SHARD_INDEX"

Feb 12 '24 21:02 rkhomi

After upgrading from 1.40.2 to 1.41.2, I got a similar issue. I can confirm that 1.40.2 works fine and 1.41.2 does not, so I reverted Playwright to 1.40.2 as a workaround. With 1.41.2 I randomly got one of the two errors below.

After upgrading to 1.41.2.

extracting: blob-report/xxx/blob-report/report-1.zip
Error: invalid local file header signature: 0x55f8eb30
    at /runner/_work/xxx/node_modules/playwright-core/lib/zipBundleImpl.js:1:30005
    at /runner/_work/xxxnode_modules/playwright-core/lib/zipBundleImpl.js:1:31700
    at /runner/_work/xxx/node_modules/playwright-core/lib/zipBundleImpl.js:1:17277
    at FSReqCallback.wrapper [as oncomplete] (node:fs:677:5)
Error: Process completed with exit code 1.

Sometimes it did not return an error at all, and execution continued with the next command (the mv command).

Start merge app
echo "export default { testDir: 'blob-report', reporter: [['html', { open: 'never' }]], };" > merge.config.ts
# find blob-report/app-e2e/blob-report before
blob-report/app-e2e/blob-report
blob-report/app-e2e/blob-report/report-1.zip
blob-report/app-e2e/blob-report/report-2.zip
# ./node_modules/.bin/playwright merge-reports --config=merge.config.ts ./blob-report/$project-e2e/blob-report
merging reports from /runner/_work/xxx/blob-report/app-e2e/blob-report
extracting: blob-report/app-e2e/blob-report/report-1.zip
...
# find blob-report/app-e2e/blob-report after
blob-report/app-e2e/blob-report/report-1.zip
blob-report/app-e2e/blob-report/report-2.zip
blob-report/app-e2e/blob-report/resources
blob-report/app-e2e/blob-report/report.jsonl
# mv playwright-report ./$project-html-report
mv: cannot stat 'playwright-report': No such file or directory

It produced report.jsonl and resources, not an HTML report, which is super weird. And there is no playwright-report folder afterwards.

Feb 13 '24 10:02 jame-earnin

Judging from https://github.com/microsoft/playwright/compare/v1.40.0...v1.41.2 this could be caused by commit dc8ecc3ca404b211815c5541dfea6e59dbd19b9a.

@jame-earnin how large is your zip / project? Would it be possible to share a reproduction?

Feb 13 '24 10:02 mxschmitt

@mxschmitt

Feb 13 '24 11:02 jame-earnin

While it shows that the blob files are not extremely large, we unfortunately still need reproduction steps to understand what is happening.

To summarise:

It happens sometimes? How often or all the time?
v1.40.0 is good, v1.41.2 is bad?
All your blobs get created on the same OS - linux?

It produced report.jsonl and resources, not an HTML report, which is super weird. And there is no playwright-report folder afterwards.

These are intermediate files; it is expected that they get created.

Feb 14 '24 09:02 mxschmitt

I'm seeing the same thing as @jame-earnin. It's not every time and it's hard to reproduce. I believe my issues started after updates to my sharding Actions workflow following the breaking changes introduced with actions/upload-artifact V4 and actions/download-artifact V4, and at first I thought it was an issue with the actions/download-artifact failing on large file sizes. I'm on Playwright 1.41.1.

This is the output from an Actions run that had three shards in it:

Current runner version: '2.313.0'
Operating System
 Ubuntu
 22.04.3
 LTS

Run actions/download-artifact@v4
Found 8 artifact(s)
Filtering artifacts by pattern 'all-blob-reports-*'
Preparing to download the following artifacts:
- all-blob-reports-2-program (ID: 1243390861, Size: 3158804)
- all-blob-reports-3-program (ID: 1243390177, Size: 628)
- all-blob-reports-1-program (ID: 1243390167, Size: 5242)
- all-blob-reports-2-review (ID: 1243390152, Size: 629)
- all-blob-reports-2-communication (ID: 1243389754, Size: 631)
- all-blob-reports-3-communication (ID: 1243389039, Size: 628)
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-fd82a3b8-38ce-5edb-b762-7ae5d074c0f5/artifacts/56d0db96bc72c2b9c9ef39089ac1e3a03d7350ea10380501b9f8331c633c8b56.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-d9a8428a-a04d-57fd-b073-bc6d7d850202/artifacts/debdab4df3055f16c055dbe9aafbdc5363cb84e8e1fb6c11f098f3b0c4be6738.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-bdfb5981-f02f-583f-ab5a-5a5bec73e8ff/artifacts/0ec1883a156ab10ee5e433ad534291e3ad403b45d2d73246ed041c628ee45538.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-f8c8d5b3-3e72-5f79-ccbe-15b471e4ad/artifacts/a827eec769ddb9eb0f85c8e5cfdd5989a319dfb92ee36bd4068e880e325e891a.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-f03aea1c-12fa-59-3f20-2c0fe93100a4/artifacts/f3b44b6b1530be177bb79843a1456bb12075c193d638cb91b345093f342e84.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-d5c870aa-6533-58f6-5b65-8ac822d71d/artifacts/2ee284d526cf8417ff0e2ed66a72a000c8b97318ed9c46f46a1c5d34dcc6d932.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
(node:1804) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Artifact download completed successfully.
Artifact download completed successfully.
Artifact download completed successfully.
Artifact download completed successfully.
Artifact download completed successfully.
Artifact download completed successfully.
Total of 6 artifact(s) downloaded
Download artifact has finished successfully

Run PLAYWRIGHT_JUNIT_OUTPUT_NAME=junit.xml npx playwright merge-reports --reporter=junit ./all-blob-reports
merging reports from /home/runner/work/project/playwright/all-blob-reports
extracting: all-blob-reports/report-1.zip
extracting: all-blob-reports/report-2.zip
Error: invalid local file header signature: 0x0
    at /home/runner/work/project/playwright/node_modules/playwright-core/lib/zipBundleImpl.js:1:30005
    at /home/runner/work/project/playwright/node_modules/playwright-core/lib/zipBundleImpl.js:1:31700
    at /home/runner/work/project/playwright/node_modules/playwright-core/lib/zipBundleImpl.js:1:17701
    at FSReqCallback.wrapper [as oncomplete] (node:fs:686:5)
Error: Process completed with exit code 1

This is an example of another run, where the merge step reports success but the following step, which looks for the merged report, fails.

Run PLAYWRIGHT_JUNIT_OUTPUT_NAME=junit.xml npx playwright merge-reports --reporter=junit ./all-blob-reports
merging reports from /home/runner/work/project/playwright/all-blob-reports
extracting: all-blob-reports/report-1.zip

Run dorny/test-reporter@v1
Check runs will be created with SHA=5a920bcd1ee374943b53ff4f5413bd93db695a
Listing all files tracked by git
Found 89 files tracked by GitHub
Using test report parser 'java-junit'
Creating test report Playwright Test Report review-board
Error: No test report files were found
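
Because merge-reports can apparently exit without an error and without writing a report, a CI step can check for the expected output itself instead of relying on the exit code alone. The following is only a minimal sketch: require_output is a hypothetical helper name, not something Playwright provides, and junit.xml is simply the output name used in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Playwright): fail loudly when an
# expected report is missing. Accepts a file (must be non-empty) or a
# directory (must contain at least one entry).
require_output() {
  local path=$1
  if [ -f "$path" ] && [ -s "$path" ]; then
    return 0
  fi
  if [ -d "$path" ] && [ -n "$(ls -A -- "$path")" ]; then
    return 0
  fi
  echo "expected output missing or empty: $path" >&2
  return 1
}

# Possible use in the merge step (left commented out so that the
# sketch runs without Playwright installed):
#   PLAYWRIGHT_JUNIT_OUTPUT_NAME=junit.xml npx playwright merge-reports --reporter=junit ./all-blob-reports
#   require_output junit.xml
```

With a guard like this, the merge step itself would fail, rather than the later reporter step that looks for the file.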

Unfortunately this is in a giant project at work that I can't post here but I'll try to get some reproducible code up.

Feb 14 '24 22:02 angelo-loria

Well, I am unable to reproduce this issue in a personal project of mine. I also reverted the affected project to 1.40.0 and 1.40.2, and I experience the issue in both versions. I've tried larger Ubuntu runners (8-core) with no luck.

Feb 20 '24 15:02 angelo-loria

I have 4 parallel Github jobs (matrix) and I was trying to run 2 projects in each job, and what I saw after running it deletes or overrides previously generated zip file from the first run, so it might have been corrupted during this process

Sounds like there is an issue with the zip files. It's unclear whether the problem is caused by the parallel jobs configuration, the upgrade of upload/download-artifact to v4, some resource constraint (files that are too big), or something in Playwright. We need a repro to take action on this. If you see the error on one of the files, it's likely a broken zip, and extracting it manually will also fail.

Feb 20 '24 20:02 yury-s

We need more information to act on this report. Please file a new one and link to this issue when you get back to it!

Feb 26 '24 18:02 yury-s

I have run into this exact issue, after upgrading to using upload/download-artifacts v4 and using merge-reports. I'm not able to share more code or the artifacts themselves.

These are the artifacts generated:

The workflow file runs a very specific scenario, where I shard into multiple jobs depending on a spec list (array of strings containing the paths to the spec), and then run max-parallel of 1, to guarantee that each block of specs runs sequentially, which contains the following steps:

jobs:
  test:
    timeout-minutes: 1440
    runs-on: self-hosted
    strategy:
      fail-fast: false
      max-parallel: 1
      matrix:
        shard: ${{ fromJSON(github.event.inputs.specs) }}
    steps:
    - uses: actions/checkout@v4
    - name: Setup Node
      uses: actions/setup-node@v4
      with:
        node-version: 20
    - name: Install dependencies
      run: npm ci
    - name: Install Playwright Browsers
      run: npx playwright install --with-deps
    - name: Run Playwright tests
      run: npx playwright test ${{ matrix.shard }} -c playwright.config.ts
    - name: Get random number to use in report
      if: always()
      id: generate_number
      run: echo "random_number=$(echo $RANDOM)" >> $GITHUB_OUTPUT
      shell: bash
    - name: Upload report blob
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: blob-reports-${{ steps.generate_number.outputs.random_number }}
        path: blob-report
        retention-days: 1
  merge-reports:
    runs-on: self-hosted
    if: always()
    needs: [test]
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: 20
    - name: Install dependencies
      run: npm ci
    - name: Download blob reports from GitHub Actions Artifacts
      uses: actions/download-artifact@v4
      with:
        pattern: blob-reports-*
        path: all-blob-reports
        merge-multiple: true
    - name: Merge into HTML Report
      run: npx playwright merge-reports --reporter html ./all-blob-reports # outputs to playwright-report folder
  ...

The test job runs a matrix that generates N blob-reports-RANDOM_NUMBER artifacts. The merge-reports job then downloads the artifacts matching that pattern, with merge-multiple enabled.

But merging fails, with the exact same problem as other users mentioned:

I read that this error might occur when trying to unzip something that is not a zip file.

Downloading the files locally, extracting them into the all-blob-reports folder while renaming the "report.zip" files to "report.zip", "report (1).zip", "report (2).zip" and so on, and then running npx playwright merge-reports --reporter html ./all-blob-reports works. It only fails in CI:

downloaded artifacts:
extracted report.zip files:
merging reports, report working just fine locally:

These artifact sizes are normal; I have opened reports before that were 3x their combined size.

Mar 01 '24 13:03 FranciscoKnebel

Tested with smaller blobs, same result, error on the same lines. The extraction changed with download-artifact@v4, so perhaps the extracted files are now no longer zipped, and it's trying to unzip something that is not a zip file?
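
One quick way to test that hypothesis in CI is to look at the first four bytes of each file before merging. A ZIP archive that starts with a file entry begins with the local file header signature 0x04034b50, stored little-endian as the bytes 50 4b 03 04 ("PK\3\4"). The sketch below is an illustration only: has_zip_signature and report_non_zip_files are hypothetical helpers, not part of Playwright, and a matching prefix does not prove that the rest of the archive is intact.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Playwright): succeeds when the file
# starts with the ZIP local file header signature "PK\003\004"
# (bytes 50 4b 03 04, i.e. 0x04034b50 read as little-endian).
has_zip_signature() {
  local magic
  # Read the first 4 bytes and print them as hex without separators.
  magic=$(head -c 4 -- "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "504b0304" ]
}

# Print every *.zip in a directory that does not start with the
# signature. Returns non-zero if at least one such file was found.
report_non_zip_files() {
  local dir=$1 file status=0
  for file in "$dir"/*.zip; do
    [ -e "$file" ] || continue   # the glob matched nothing
    if ! has_zip_signature "$file"; then
      echo "not a zip archive: $file"
      status=1
    fi
  done
  return "$status"
}
```

In a workflow this could run as report_non_zip_files ./all-blob-reports in a step placed just before merge-reports.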

Mar 18 '24 19:03 FranciscoKnebel

error is thrown here, in openReadStream, in playwright-core/lib/zipBundleImpl.js:

Mar 18 '24 19:03 FranciscoKnebel

Some recent investigation on our end showed that this might be caused by the following:

Run gets cancelled
npx playwright test did not finish writing all the blobs
if: always() kicks in and uploads a broken blob
From there on things don’t work anymore.

as per here we should change

if: always()

to

if: ${{ !cancelled() }}

Which should fix this issue. We would appreciate it if you could test this before we roll it out across docs/create-playwright. It's a very early investigation, so we haven't even tried it ourselves, but it looks promising. Thanks!
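
If a blob really was cut off in the middle of being written, the file will usually be missing the end-of-central-directory record that closes every ZIP archive (signature 0x06054b50, bytes 50 4b 05 06). As a rough sketch of how a job could detect that before uploading or merging, under the assumption that the archive carries no trailing comment so the record is exactly the last 22 bytes, and with has_zip_trailer being a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Playwright): succeeds when the file
# ends with a ZIP end-of-central-directory record (bytes 50 4b 05 06).
# Assumption: no archive comment, so the record is exactly the last
# 22 bytes of the file. Archives with a comment are reported as broken.
has_zip_trailer() {
  local magic
  # Take the last 22 bytes, keep their first 4, and print them as hex.
  magic=$(tail -c 22 -- "$1" | head -c 4 | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "504b0506" ]
}
```

A failing check would point to a truncated file, which is the scenario described above. It says nothing about corruption in the middle of an archive.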

Mar 20 '24 01:03 mxschmitt

Thanks for the investigation. I ran some tests to confirm this, and it no longer fails. However, there might be another problem with merging the reports. The following test was split into 3 blobs:

To run the tests, I updated the actions/upload-artifact@v4 step to use !cancelled():

test
  (...)
    - name: Install Playwright Browsers
      run: npx playwright install --with-deps
    - name: Run Playwright tests on demo
      run: npx playwright test ${{ matrix.shard }} -c playwright.config.ts
    - name: Upload report blob
      uses: actions/upload-artifact@v4
      if: ${{ !cancelled() }}
      with:
        name: blob-reports-${{ steps.generate_number.outputs.random_number }}
        path: blob-report
        retention-days: 1

And on the second job, to merge the reports:

merge-reports:
    runs-on: self-hosted
    permissions:
      id-token: write
      contents: read
      pull-requests: write
    if: ${{ !cancelled() }}
    needs: [test]
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: 20
    - name: Install dependencies
      run: npm ci
    - name: Download blob reports from GitHub Actions Artifacts
      uses: actions/download-artifact@v4
      with:
        pattern: blob-reports-*
        path: all-blob-reports
        merge-multiple: true
    - name: Merge into HTML Report
      run: npx playwright merge-reports --reporter html ./all-blob-reports # outputs to playwright-report folder
      (...)
    - name: Upload HTML report to S3
      run: aws s3 sync ./playwright-report s3://automation-reports.ggoutfitters.com/playwright/${{ github.run_id }}/

You can see from the first image, compared to the local execution in https://github.com/microsoft/playwright/issues/29451#issuecomment-1973228011 that the merge apparently didn't finish before the job moved on to the next step.

In the second image, you can see the download of the artifacts and that it started to run the merge-reports command, but the playwright-report folder wasn't created.

I'm trying again and going to list the contents in the directory before the final report upload attempt, to see if it's reproducible.

EDIT:

Another attempt, now listing directory content:

Mar 20 '24 03:03 FranciscoKnebel

Hey @mxschmitt Thanks for reopening this issue, it's really important to me that this gets fixed so we can start using the test reports again. I took a look in the referenced PRs and tested the zip integrity as well:

Maybe this problem is with the merge-multiple: true option used with download-artifact@v4 ?

Did a follow-up test without the merge-multiple option:

    - name: Download blob reports from GitHub Actions Artifacts
      uses: actions/download-artifact@v4
      with:
        pattern: blob-reports-*
        path: all-blob-reports
    - name: Report Integrity check
      shell: bash
      run: |
        for file in all-blob-reports/*.zip; do
          unzip -t "$file"
        done
    - name: Merge into HTML Report
      run: npx playwright merge-reports --reporter html ./all-blob-reports

and the integrity check passed:

I'm doing another test next, just need to have unar installed in my self-hosted runner, where I'll do the merging manually. Locally it worked, going to confirm if this works in the runner:

    - name: Download blob reports from GitHub Actions Artifacts
      uses: actions/download-artifact@v4
      with:
        pattern: blob-reports-*
        path: all-blob-reports
    - name: Report Integrity check
      shell: bash
      run: |
        for file in all-blob-reports/*.zip; do
          unzip -t "$file"
        done
    - name: Extract blob artifacts
      shell: bash
      run: |
        for z in all-blob-reports/*.zip; do unar -r "$z" -o reports; done
    - name: Merge into HTML Report
      if: ${{ !cancelled() }}
      run: npx playwright merge-reports --reporter html ./reports # outputs to playwright-report folder

This will download all the blobs into all-blob-reports. Those will be multiple blob-reports-#.zip files, each containing a report.zip file. I haven't found a way to make unzip rename these files on extraction, but unar can. The loop

for z in all-blob-reports/*.zip; do unar -r "$z" -o reports; done

extracts all the artifact files, saving them as report.zip, report-1.zip, report-2.zip and so on, which playwright merge-reports handles.
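
Where unar is not available, the same renaming can be approximated with standard tools once the report.zip files are on disk. The sketch below assumes that each artifact has already been placed in its own subdirectory (for example all-blob-reports/blob-reports-123/report.zip); that layout, the report-N.zip naming and the helper name collect_reports are assumptions made for illustration, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy every *.zip found below SRC into DEST,
# renaming them report-1.zip, report-2.zip, ... so that equally named
# files from different artifacts cannot overwrite each other.
# DEST must be outside SRC; file names containing newlines are not
# supported.
collect_reports() {
  local src=$1 dest=$2 n=0 file list
  mkdir -p "$dest"
  # Sort the paths so the numbering is stable between runs.
  list=$(find "$src" -type f -name '*.zip' | sort)
  while IFS= read -r file; do
    [ -n "$file" ] || continue   # empty list yields one empty line
    n=$((n + 1))
    cp -- "$file" "$dest/report-$n.zip"
  done <<EOF
$list
EOF
  echo "collected $n report(s) into $dest"
}
```

For example, collect_reports ./all-blob-reports ./reports followed by npx playwright merge-reports --reporter html ./reports would mirror the manual steps above.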

Screenshot 2024-03-22 193805

I'll confirm if this worked.

Mar 22 '24 22:03 FranciscoKnebel

Maybe related to https://github.com/actions/download-artifact/issues/298

Mar 22 '24 22:03 mxschmitt

I tried to reproduce with this workflow, but was not able to: https://github.com/mxschmitt/test/actions/runs/8397555420/job/23001044584

Mar 22 '24 23:03 mxschmitt
