
StreamReader on local zip data - A local file header is corrupt

Open • Midaroh opened this issue 3 years ago • 1 comment

Problem originally brought up in this QC community post: https://www.quantconnect.com/forum/discussion/14110/streamreader-on-local-zip-data-a-local-file-header-is-corrupt/p1

Expected Behavior

After downloading new QC data that is distributed in a .zip file, I should be able to unzip those files using ZipArchive/StreamReader in a C# application, and then process that data.

Actual Behavior

I get the exception upon accessing the newly downloaded .zip files with StreamReader: “System.IO.InvalidDataException: 'A local file header is corrupt.'”

Example code:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.IO.Compression;

    List<BarData> barDatas = new List<BarData>();

    using (ZipArchive archive = ZipFile.OpenRead(barDataFilePath))
    {
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            // Only the .csv entries inside the archive are of interest.
            if (entry.FullName.EndsWith(".csv", StringComparison.OrdinalIgnoreCase))
            {
                using (StreamReader streamReader = new StreamReader(entry.Open())) // Error occurs here
                {
                }
            }
        }
    }

Potential Solution

There may be an issue with the installed .NET version, according to this dotnet issue: https://github.com/dotnet/runtime/issues/1094. I updated my .NET SDK with dotnet-sdk-6.0.400-win-x64, but that did not fix the problem for me. Alternatively, could the team suggest some other way of reading the bar data? Lean CLI seems to read it just fine.

The workaround I have found so far is to use WinRAR's fix archive feature, or to manually unzip and re-zip the file.
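Since the exception names the local file header, one way to narrow this down is to dump that header from a freshly downloaded file and again from a repaired or re-zipped copy, then compare the two. The following is only a minimal sketch: `ZipHeaderProbe` and `LocalHeader` are hypothetical names, not part of Lean or .NET, and it assumes the first entry's local file header starts at offset 0 of the file. It reads the fixed 30-byte header layout from the ZIP format and does not tell you which field the runtime rejects.

```csharp
using System;
using System.IO;

// Hypothetical diagnostic helper (not part of Lean or .NET).
// Reads the fixed 30-byte ZIP local file header found at a given offset.
public static class ZipHeaderProbe
{
    // Local file header signature "PK\x03\x04", read as a little-endian uint.
    public const uint LocalFileHeaderSignature = 0x04034b50;

    public sealed class LocalHeader
    {
        public uint Signature;
        public ushort VersionNeeded;
        public ushort Flags;
        public ushort CompressionMethod;
        public uint CompressedSize;
        public uint UncompressedSize;
        public ushort FileNameLength;
        public ushort ExtraFieldLength;

        public bool HasValidSignature
        {
            get { return Signature == LocalFileHeaderSignature; }
        }
    }

    // Assumption: 'offset' points at the start of a local file header
    // (offset 0 for the first entry of an ordinary archive).
    public static LocalHeader Read(Stream stream, long offset = 0)
    {
        stream.Seek(offset, SeekOrigin.Begin);

        // BinaryReader reads little-endian values, matching the ZIP format.
        // It is deliberately not disposed so the caller keeps ownership of the stream.
        var reader = new BinaryReader(stream);
        var header = new LocalHeader();

        header.Signature = reader.ReadUInt32();
        header.VersionNeeded = reader.ReadUInt16();
        header.Flags = reader.ReadUInt16();
        header.CompressionMethod = reader.ReadUInt16();
        reader.ReadUInt16(); // last modified time (skipped)
        reader.ReadUInt16(); // last modified date (skipped)
        reader.ReadUInt32(); // CRC-32 (skipped)
        header.CompressedSize = reader.ReadUInt32();
        header.UncompressedSize = reader.ReadUInt32();
        header.FileNameLength = reader.ReadUInt16();
        header.ExtraFieldLength = reader.ReadUInt16();

        return header;
    }
}
```

To use it, open the archive with `File.OpenRead(barDataFilePath)`, pass the stream to `ZipHeaderProbe.Read`, and print the fields for both the original and the repaired file. Differences in `Flags`, the two size fields, or `ExtraFieldLength` would be the first things to look at, though that is a guess rather than a confirmed cause.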

Reproducing the Problem

Download bar data files using the Lean CLI console prompts (in my case, GBPUSD in second resolution). Then create a basic C# application with the code below and give it the path to one of the newly downloaded QC data files.

    List<BarData> barDatas = new List<BarData>();

    using (ZipArchive archive = ZipFile.OpenRead(barDataFilePath))
    {
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            if (entry.FullName.EndsWith(".csv", StringComparison.OrdinalIgnoreCase))
            {
                using (StreamReader streamReader = new StreamReader(entry.Open())) // Error occurs here
                {
                }
            }
        }
    }

System Information

Windows 10, .NET Framework 4.0

Midaroh, Aug 30 '22 04:08

Hey @Midaroh! Thanks for the report.

> Or if the team can suggest some other way of reading the bardata,

Lean reads the data correctly without any issues, so you can reuse the Lean compression implementation: https://www.nuget.org/packages/QuantConnect.Compression/

Martin-Molinero, Sep 01 '22 15:09

Replaced by https://github.com/QuantConnect/Lean/issues/2435

Martin-Molinero, Feb 17 '23 20:02