StreamReader on local zip data - A local file header is corrupt
Problem originally brought up in this QC community post: https://www.quantconnect.com/forum/discussion/14110/streamreader-on-local-zip-data-a-local-file-header-is-corrupt/p1
Expected Behavior
After downloading new QC data that is distributed in a .zip file, I should be able to unzip those files using ZipArchive/StreamReader in a C# application, and then process that data.
Actual Behavior
Opening an entry of a newly downloaded .zip file with StreamReader throws: “System.IO.InvalidDataException: 'A local file header is corrupt.'”
Example code:
List<BarData> barDatas = new List<BarData>();
using (ZipArchive archive = ZipFile.OpenRead(barDataFilePath))
{
    foreach (ZipArchiveEntry entry in archive.Entries)
    {
        if (entry.FullName.EndsWith(".csv", StringComparison.OrdinalIgnoreCase))
        {
            using (StreamReader streamReader = new StreamReader(entry.Open())) // Error occurs here
            {
            }
        }
    }
}
Potential Solution
There may be an issue with the installed .NET version, according to this dotnet issue: https://github.com/dotnet/runtime/issues/1094. I updated my .NET SDK with dotnet-sdk-6.0.400-win-x64, but that did not fix the problem for me. Alternatively, perhaps the team can suggest some other way of reading the bardata, since the Lean CLI seems to read it just fine.
The workaround I have found so far is to use WinRAR's fix archive feature, or to manually unzip and rezip the file.
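To see what the reader might be objecting to, it can help to dump the first local file header of the archive. The following is only a diagnostic sketch based on the fixed 30-byte local file header layout from the ZIP format; the class and method names (`LocalHeaderDump`, `Describe`) are made up for illustration, and nothing here is confirmed to be the cause of this particular error.

```csharp
using System;
using System.IO;

// Hypothetical helper: decodes the fixed 30-byte part of a ZIP local file header.
public static class LocalHeaderDump
{
    // Little-endian signature "PK\x03\x04" that starts every local file header.
    public const uint LocalHeaderSignature = 0x04034b50;

    public static string Describe(byte[] header)
    {
        if (header.Length < 30)
        {
            return "too short for a local file header";
        }

        // BinaryReader always reads little-endian, which matches the ZIP format.
        using (var r = new BinaryReader(new MemoryStream(header)))
        {
            uint signature = r.ReadUInt32();
            if (signature != LocalHeaderSignature)
            {
                return "unexpected signature 0x" + signature.ToString("x8");
            }

            ushort version = r.ReadUInt16();   // version needed to extract
            ushort flags = r.ReadUInt16();     // bit 3 (0x0008) means sizes follow in a data descriptor
            ushort method = r.ReadUInt16();    // 0 = stored, 8 = deflate
            r.ReadUInt32();                    // skip last-modified time and date
            uint crc32 = r.ReadUInt32();
            uint compressedSize = r.ReadUInt32();
            uint uncompressedSize = r.ReadUInt32();
            ushort nameLength = r.ReadUInt16();
            ushort extraLength = r.ReadUInt16();

            return string.Format(
                "version={0} flags=0x{1:x4} method={2} crc32=0x{3:x8} compressed={4} uncompressed={5} nameLength={6} extraLength={7}",
                version, flags, method, crc32, compressedSize, uncompressedSize, nameLength, extraLength);
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: LocalHeaderDump <path-to-zip>");
            return;
        }

        byte[] header;
        using (var reader = new BinaryReader(File.OpenRead(args[0])))
        {
            header = reader.ReadBytes(30); // returns fewer bytes if the file is shorter
        }
        Console.WriteLine(Describe(header));
    }
}
```

Running this on a freshly downloaded file and on the same file after a manual unzip/rezip should show which header fields differ between the two.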
Reproducing the Problem
Download bardata files using the Lean CLI console prompts (in my case, GBPUSD in second resolution). Then create a basic C# application with the code below and give it the path to one of the newly downloaded QC data files.
List<BarData> barDatas = new List<BarData>();
using (ZipArchive archive = ZipFile.OpenRead(barDataFilePath))
{
    foreach (ZipArchiveEntry entry in archive.Entries)
    {
        if (entry.FullName.EndsWith(".csv", StringComparison.OrdinalIgnoreCase))
        {
            using (StreamReader streamReader = new StreamReader(entry.Open())) // Error occurs here
            {
            }
        }
    }
}
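For anyone trying to reproduce this, a self-contained variant of the snippet may be easier to run. The sketch below uses only `System.IO.Compression`; the `ZipProbe` class and `ProbeCsvEntries` method are hypothetical names, and the `BarData` parsing is left out because that type is not shown in this report. It reads every .csv entry to the end and records which entries throw `InvalidDataException`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical repro helper: reports which .csv entries can be read.
public static class ZipProbe
{
    // Returns one status line per .csv entry in the archive.
    public static List<string> ProbeCsvEntries(string zipPath)
    {
        var report = new List<string>();
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                if (!entry.FullName.EndsWith(".csv", StringComparison.OrdinalIgnoreCase))
                {
                    continue;
                }

                try
                {
                    int lines = 0;
                    // entry.Open() is where the exception was reported in this issue.
                    using (var reader = new StreamReader(entry.Open()))
                    {
                        while (reader.ReadLine() != null)
                        {
                            lines++;
                        }
                    }
                    report.Add("OK " + entry.FullName + " (" + lines + " lines)");
                }
                catch (InvalidDataException ex)
                {
                    report.Add("FAILED " + entry.FullName + ": " + ex.Message);
                }
            }
        }
        return report;
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: ZipProbe <path-to-zip>");
            return;
        }

        foreach (string line in ProbeCsvEntries(args[0]))
        {
            Console.WriteLine(line);
        }
    }
}
```

Note that `ZipFile.OpenRead` itself can also throw if the archive's central directory cannot be read; that case is deliberately not caught here so it stays visible.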
System Information
Windows 10, .NET Framework 4.0
Hey @Midaroh! Thanks for the report.
Or if the team can suggest some other way of reading the bardata,
Lean reads the data correctly without any issues. You can reuse the Lean compression implementation: https://www.nuget.org/packages/QuantConnect.Compression/
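As a rough idea of what reusing that package could look like, here is a minimal sketch. It assumes the package exposes a static `QuantConnect.Compression.Unzip(string)` method that yields one key/value pair per zip entry (entry name, then its lines); that name and shape are written from memory of the Lean source and should be checked against the package version you install. `LeanZipRead` is a hypothetical class name.

```csharp
using System;
using QuantConnect; // from the QuantConnect.Compression NuGet package

// Hypothetical console program; requires the QuantConnect.Compression package.
public static class LeanZipRead
{
    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: LeanZipRead <path-to-zip>");
            return;
        }

        // Assumption: Compression.Unzip(path) enumerates (entry name, lines) pairs.
        foreach (var entry in Compression.Unzip(args[0]))
        {
            int count = 0;
            foreach (var line in entry.Value)
            {
                count++; // parse each CSV line into your own bar type here
            }
            Console.WriteLine(entry.Key + ": " + count + " lines");
        }
    }
}
```

Using `var` for the loop variables keeps the sketch independent of the exact collection type the method returns.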
Replaced by https://github.com/QuantConnect/Lean/issues/2435