Parsing a .xlsx file with SharpZipLib fails with a ‘Data descriptor signature not found’ error

Open • Helen21356 opened this issue 2 months ago • 1 comment

Describe the bug

Using SharpZipLib to parse a .xlsx file fails with a ‘Data descriptor signature not found’ error.

Questions:

  1. What does this error mean? What is the 'Data descriptor signature' that the code tried to read?
  2. Is there any plan to support this kind of file?

Exception details: Only '.xls' and '.xlsx' format is supported in reading excel file while error is '
    at ICSharpCode.SharpZipLib.Zip.ZipInputStream.ReadDataDescriptor()
    at ICSharpCode.SharpZipLib.Zip.ZipInputStream.CompleteCloseEntry(Boolean testCrc)
    at ICSharpCode.SharpZipLib.Zip.ZipInputStream.BodyRead(Byte[] buffer, Int32 offset, Int32 count)
    at NPOI.OpenXml4Net.Util.ZipInputStreamZipEntrySource.FakeZipEntry..ctor(ZipEntry entry, ZipInputStream inp)
    at NPOI.OpenXml4Net.Util.ZipInputStreamZipEntrySource..ctor(ZipInputStream inp)
    at NPOI.OpenXml4Net.OPC.ZipPackage..ctor(Stream filestream, PackageAccess access)
    at NPOI.OpenXml4Net.OPC.OPCPackage.Open(Stream in1)
    at NPOI.Util.PackageHelper.Open(Stream is1)
    at NPOI.XSSF.UserModel.XSSFWorkbook..ctor(Stream is1)
    at Microsoft.DataTransfer.ClientLibrary.ExcelUtility.GetExcelWorkbook(String fileExtension, TransferStream stream)
'. Data descriptor signature not found

For comparison:

  1. Python's openpyxl can parse this file successfully, so SharpZipLib appears to apply stricter verification. https://pypi.org/project/openpyxl/
  2. Office Excel cannot open this .xlsx file successfully. We also tried 'Save as' from the Excel application, after which SharpZipLib can handle the re-saved file successfully.

Reproduction Code

No response

Steps to reproduce

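Using the following sample code to read an Excel file with 'bad data':

using System;
using System.IO;
using NPOI.SS.UserModel;
using NPOI.XSSF.UserModel;

// "ReadExcel" is a placeholder name; the original snippet started inside the method body.
static void ReadExcel(string filePath)
{
    try
    {
        using (FileStream file = new FileStream(filePath, FileMode.Open, FileAccess.Read))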
        {
            XSSFWorkbook workbook = new XSSFWorkbook(file);
            ISheet sheet = workbook.GetSheet("Page1_1");
            if (sheet == null)
            {
                Console.WriteLine("Sheet 'Page1_1' not found.");
                return;
            }
            for (int row = 0; row <= sheet.LastRowNum; row++)
            {
                IRow currentRow = sheet.GetRow(row);
                if (currentRow == null) continue;
                for (int col = 0; col < currentRow.LastCellNum; col++)
                {
                    var cell = currentRow.GetCell(col);
                    Console.Write((cell?.ToString() ?? "") + "\t");
                }
                Console.WriteLine();
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine("Error reading Excel file:");
        Console.WriteLine(ex.StackTrace);
        Console.WriteLine(ex.Message);
        Console.WriteLine(ex.InnerException);
    }
}

Expected behavior

Can SharpZipLib support reading this type of file?

Operating System

No response

Framework Version

No response

Tags

No response

Additional context

No response

Oct 16 '25 05:10 Helen21356

The Data Descriptor is a part of the ZIP format that your corrupted file is missing (see APPNOTE section 4.3.9). SharpZipLib could probably still read the file, but not through the streaming API (which OpenXml4Net seems to use), because that API relies on the file being readable as a stream.
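To make the term concrete: in the ZIP format, when bit 3 of an entry's general purpose flag is set, the CRC-32 and sizes are written after the entry's data in a data descriptor, which is commonly prefixed with the signature 0x08074b50. A reader that walks the file sequentially cannot consult the central directory, so it depends on that record, and this appears to be what the exception message refers to. The following is only a minimal sketch using the .NET base class library, not SharpZipLib's actual implementation; `ZipHeaderProbe` and its method names are made up for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of SharpZipLib or NPOI.
static class ZipHeaderProbe
{
    const uint LocalHeaderSignature = 0x04034b50;    // bytes "PK\x03\x04"
    const uint DataDescriptorSignature = 0x08074b50; // bytes "PK\x07\x08"

    // Returns true when the first entry's general purpose flag has bit 3 set,
    // meaning CRC-32 and sizes are deferred to a data descriptor after the data.
    public static bool FirstEntryUsesDataDescriptor(Stream zip)
    {
        var reader = new BinaryReader(zip);          // BinaryReader reads little-endian
        if (reader.ReadUInt32() != LocalHeaderSignature)
            throw new InvalidDataException("Stream does not start with a local file header.");
        reader.ReadUInt16();                         // version needed to extract (skipped)
        ushort flags = reader.ReadUInt16();          // general purpose bit flag
        return (flags & 0x0008) != 0;
    }

    // Returns true when the four bytes at 'offset' form the descriptor signature.
    public static bool HasDescriptorSignature(byte[] data, int offset)
    {
        if (offset < 0 || offset + 4 > data.Length)
            return false;
        // Assemble the little-endian 32-bit value by hand so the result does not
        // depend on the machine's byte order.
        uint value = (uint)(data[offset]
                            | (data[offset + 1] << 8)
                            | (data[offset + 2] << 16)
                            | (data[offset + 3] << 24));
        return value == DataDescriptorSignature;
    }
}
```

Calling `ZipHeaderProbe.FirstEntryUsesDataDescriptor(File.OpenRead(path))` on the failing .xlsx would show whether its first entry claims to have a descriptor. If the file only fails in streaming mode, SharpZipLib's `ZipFile` class, which reads the central directory from a seekable stream, may be worth trying as a diagnostic.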

Dec 17 '25 11:12 piksel