Potential Issue: Hang When Used with tar --use-compress-program on Large Files (>300 MB) in Windows PowerShell

Open denilly opened this issue 6 months ago • 4 comments

Description:
Hello zstd team!

I would like to report a potential issue when using zstd.exe as a compression program in a pipeline with the tar command on Windows, specifically in PowerShell. I am experiencing an indefinite hang when attempting to compress directories or large files using the command tar --use-compress-program 'zstd.exe -T0 -19' -cf. The issue occurs consistently when the total directory size exceeds approximately 350 MB or when a single file is larger than 300 MB.

Steps to Reproduce:

  1. Use the native tar.exe from Windows or the version from Git for Windows, along with zstd.exe (version 1.5.7) on Windows 10/11 or Windows Server.
  2. Create a test directory (e.g., C:\backup) with a total size exceeding 350 MB or a single file larger than 300 MB.
  3. Run the following command in PowerShell:
    tar --use-compress-program "zstd.exe -T0 -19" -cf C:\backup\backup.tar.zst -C C:\backup Backup_Folder
    
  4. Observe that the process hangs indefinitely (no explicit error, it just remains running).

Expected Behavior:
The command should complete the compression of the directory or large file, generating the backup.tar.zst file without hanging.

Observed Behavior:
The process hangs with no visible progress: CPU usage drops to 0% while RAM usage typically stays above 900 MB. It has to be interrupted with Ctrl+C. The same command works for smaller directories (<350 MB) or files (<300 MB).
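
For scripted backups, a hang like this can be turned into a detectable failure by running the command under a timeout. The following is a minimal Python sketch, not part of the original report. The helper name run_with_timeout and the 600 second limit are illustrative assumptions. Killing tar.exe on timeout may leave the compressor process it spawned still running, so that could need separate cleanup.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it exceeded timeout_s seconds."""
    try:
        # subprocess.run kills the direct child and waits for it when the timeout expires
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


# Example usage (paths taken from the report; the 600 s limit is an arbitrary assumption):
# rc = run_with_timeout(
#     ["tar", "--use-compress-program", "zstd.exe -T0 -19",
#      "-cf", r"C:\backup\backup.tar.zst", "-C", r"C:\backup", "Backup_Folder"],
#     timeout_s=600,
# )
# if rc is None:
#     print("tar did not finish in time; treating it as a hang")
```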

Additional Tests:

  • Using two separate commands (tar -cf followed by zstd -T0 -19) resolves the issue, suggesting the hang may be related to the pipeline.
  • Reducing the thread count and compression level (e.g., -T1 -6) does not prevent the hang.
  • Processing in batches avoids the hang but is not an ideal solution for large volumes.
  • Tested with 7-Zip to create .tar and then compress with zstd.exe, which works, but I prefer to use tar for compatibility and to reduce the number of commands in scripts.
  • On Linux, using the command tar -I 'zstd -T0 -19' -cf with directories and files exceeding 1 GB, including a single file, did not result in any issues.
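
The two-command approach from the first bullet (tar -cf, then zstd on the resulting file) can be wrapped so that scripts still make a single call. This is a sketch under assumptions: the helper names are hypothetical, tar and zstd are assumed to be on PATH, and the intermediate .tar should be placed outside the folder being archived.

```python
import os
import subprocess


def two_step_commands(parent_dir, folder, tar_path, zst_path):
    """Build the two commands: plain tar first, then zstd on the resulting file."""
    tar_cmd = ["tar", "-cf", tar_path, "-C", parent_dir, folder]
    zstd_cmd = ["zstd", "-T0", "-19", tar_path, "-o", zst_path]
    return [tar_cmd, zstd_cmd]


def run_two_step(parent_dir, folder, tar_path, zst_path):
    """Run both steps in order, stopping at the first failure."""
    for cmd in two_step_commands(parent_dir, folder, tar_path, zst_path):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on non-zero exit
    os.remove(tar_path)  # reached only if both steps succeeded


# Example usage (the temporary .tar location is an assumption):
# run_two_step(r"C:\backup", "Backup_Folder",
#              r"C:\backup\backup.tar", r"C:\backup\backup.tar.zst")
```

The cost is temporary disk space for the uncompressed archive, which is the trade-off already implied by running the two commands separately.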

Environment:

  • Operating System: Windows 10/11 or Windows Server 2022
  • PowerShell: 5.1 or 7.x
  • tar.exe: bsdtar 3.7.7 - libarchive 3.7.7 zlib/1.2.13.1-motley liblzma/5.4.3 bz2lib/1.0.8 libzstd/1.5.5
  • zstd.exe: 1.5.7 (downloaded from https://github.com/facebook/zstd/releases)
  • Date: July 10, 2025

Question:
Could this be a known limitation of zstd when handling large data streams in pipelines on Windows? Or might it be a compatibility issue with the version of tar being used? I would appreciate any guidance or suggestions to work around this or confirm if it’s a bug worth investigating.

Thank you for your attention and for the excellent work on zstd!

denilly avatar Jul 10 '25 17:07 denilly

Have you tried other compression programs with the same pattern: tar --use-compress-program "someCompression --someParam" -cf ... ?

Cyan4973 avatar Jul 10 '25 18:07 Cyan4973

Hello!

In response to the question, I conducted additional tests to investigate whether the hang is specific to zstd or related to the use of --use-compress-program. Below are the details:

Tests Performed:

  1. With gzip.exe:
    • Executed the command:
      tar --use-compress-program "gzip.exe" -cf C:\backup\backup.tar.gz -C C:\backup Backup_Folder
      
    • Result: The process hung indefinitely, even with files smaller than 300 MB, replicating the behavior observed with zstd.exe -T0 -19. No explicit error was generated, and interruption with Ctrl+C was required.
    • Additional Test with Git tar:
      When using the tar.exe from Git for Windows, the following error occurred:
      PS C:\Program Files\Git\usr\bin> .\tar.exe --use-compress-program 'gzip.exe' -cvf "C:\backup\Backup.tar.gz" -C "C:\backup" "Backup_Folder"
      Backup_Folder/
      Backup_Folder/teste.txt
      tar (child): Cannot connect to C: resolve failed
      /usr/bin/tar: C\:\\backup\\Backup.tar.gz: Cannot write: Broken pipe
      /usr/bin/tar: Child returned status 128
      /usr/bin/tar: Error is not recoverable: exiting now
      
      This error suggests a failure in the pipeline or file handling, consistent with the hang observed in larger files.
  2. With tar -cvzf (internal gzip):
    • Executed the command:
      tar -cvzf C:\backup\backup.tar.gz -C C:\backup Backup_Folder
      
    • Result: The compression completed successfully, even with files larger than 350 MB, with no hangs or errors.
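
A note on the Git for Windows output in test 1: "Cannot connect to C: resolve failed" is the message GNU tar prints when it treats an archive name containing a colon as host:path on a remote machine. GNU tar has a --force-local option that makes it treat such names as local files. The sketch below only builds the command line. The helper name is hypothetical, the option concerns GNU tar rather than the native bsdtar, and whether it also avoids the hang is untested here.

```python
def gnu_tar_create_cmd(archive, parent_dir, folder, compress_program):
    """Build a GNU tar create command that treats `archive` as a local file."""
    return [
        "tar",
        "--force-local",  # do not parse "C:" in the archive name as a remote host
        "--use-compress-program", compress_program,
        "-cf", archive,
        "-C", parent_dir,
        folder,
    ]


# Example usage with the values from the test above:
# cmd = gnu_tar_create_cmd(r"C:\backup\Backup.tar.gz", r"C:\backup",
#                          "Backup_Folder", "gzip.exe")
```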

Analysis:

  • The success with tar -cvzf (using the internal gzip) contrasts with the hang and error when using --use-compress-program "gzip.exe", suggesting that the issue is not with the compressors themselves (gzip.exe or zstd.exe) but possibly with the external pipeline mechanism enabled by --use-compress-program in my environment (Windows 10/11, PowerShell 5.1/7.x, tar.exe 2.42.0.windows.2 from Git for Windows).
  • The error "Cannot connect to C: resolve failed" and "Broken pipe" with the Git tar.exe indicate a potential incompatibility or limitation in how the external compressor is invoked, especially on Windows paths.
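
One way to check whether the --use-compress-program mechanism itself is at fault is to wire the pipe outside tar: have tar write the archive to standard output (-cf -) and start the compressor as a separate process reading from that pipe. A minimal Python sketch under that assumption follows. pipe_commands is a hypothetical helper, and this has not been verified against the setup described in this issue.

```python
import subprocess


def pipe_commands(producer, consumer):
    """Run `producer | consumer` with an explicit pipe; return both exit codes."""
    prod = subprocess.Popen(producer, stdout=subprocess.PIPE)
    cons = subprocess.Popen(consumer, stdin=prod.stdout)
    # Drop our copy of the pipe's read end so the producer gets a broken-pipe
    # error (instead of blocking) if the consumer exits early.
    prod.stdout.close()
    cons_rc = cons.wait()
    prod_rc = prod.wait()
    return prod_rc, cons_rc


# Example usage with the paths from the report:
# tar_cmd = ["tar", "-cf", "-", "-C", r"C:\backup", "Backup_Folder"]
# zstd_cmd = ["zstd", "-T0", "-19", "-o", r"C:\backup\backup.tar.zst"]
# print(pipe_commands(tar_cmd, zstd_cmd))
```

If this variant completes on inputs that hang with --use-compress-program, that would point toward how tar spawns and feeds the external compressor rather than toward zstd or gzip.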

Revised Question: Based on this, could the hang and error be a limitation or bug in tar.exe (Git for Windows) when using --use-compress-program with external pipelines on Windows/PowerShell? Or might there be a specific configuration of zstd.exe (version 1.5.7) or gzip.exe that causes this with large data streams or Windows paths? I appreciate any guidance on how to work around this or whether it should be reported to the tar project (Git for Windows).

Temporary Workaround: For now, I am using tar -cvzf for gzip or two separate commands (tar -cf followed by zstd.exe) to avoid the hang, but I would prefer to retain the flexibility of --use-compress-program if the issue can be resolved.

Thank you!

denilly avatar Jul 10 '25 19:07 denilly

Based on this, could the hang and error be a limitation or bug in tar.exe (Git for Windows) when using --use-compress-program with external pipelines on Windows/PowerShell?

That sounds likely

Cyan4973 avatar Jul 10 '25 23:07 Cyan4973

Maybe related https://github.com/libarchive/libarchive/issues/1419 ?

lazka avatar Jul 19 '25 04:07 lazka