Files and folders with non-English characters in their names get gibberish names after unpacking
Steps to reproduce
- Pack "слово.txt" with any archiver, e.g. Bandizip
- Check that other programs unpack it correctly (e.g. Windows Explorer or the same program that packed it)
- Unpack it via `new FastZip().ExtractZip()`
Expected behavior
Files and folders have the same names as when they were packed
Actual behavior
File and folder names look like ����� �ணࠬ��
Version of SharpZipLib
1.4.0
Obtained from (only keep the relevant lines)
- Package installed using NuGet
Tried actions:
ZipStrings.CodePage = 866; // No data is available for encoding 866. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.
ZipStrings.CodePage = 1251; // same error
ZipStrings.CodePage = 65001; // gibberish
ZipStrings.UseUnicode = true; // gibberish
new FastZip {
    EntryFactory = new ZipEntryFactory {
        IsUnicodeText = true
    }
}.ExtractZip(filepath, (TempFolder ? "Temp\\" : ""), null);
instead of just
new FastZip().ExtractZip(filepath, (TempFolder ? "Temp\\" : ""), null);
does not work either.
IsUnicodeText = true gives the same result as IsUnicodeText = false.
IsUnicodeText = true gives same result as IsUnicodeText = false
This is because it's only used for creating entries, not when reading.
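For context, the ZIP format records per entry whether its name is UTF-8, in bit 11 (0x0800) of the general purpose flags. A writer-side option such as IsUnicodeText decides what gets stored when an entry is created; a reader can only go by the flag already in the archive. The sketch below illustrates that read-side decision. `DecodeEntryName` is a hypothetical helper, not SharpZipLib API, and the example assumes .NET 5+ (for `Encoding.Latin1` and top-level statements).

```csharp
using System;
using System.Text;

// Bit 11 (0x0800) of a ZIP entry's general purpose flags marks the name as UTF-8.
const int Utf8NameFlag = 0x0800;

// Hypothetical helper (not SharpZipLib API): pick the decoder from the stored flags.
string DecodeEntryName(byte[] rawName, int generalPurposeFlags, Encoding legacyEncoding)
{
    bool isUtf8 = (generalPurposeFlags & Utf8NameFlag) != 0;
    return (isUtf8 ? Encoding.UTF8 : legacyEncoding).GetString(rawName);
}

byte[] utf8Name = Encoding.UTF8.GetBytes("слово.txt");

// Flag set: the bytes are read as UTF-8, whatever the legacy encoding is.
Console.WriteLine(DecodeEntryName(utf8Name, Utf8NameFlag, Encoding.Latin1));

// Flag clear: the same bytes go through the legacy encoding and come out garbled.
Console.WriteLine(DecodeEntryName(utf8Name, 0, Encoding.Latin1));
```

So if the archiver did not set the flag, changing a creation-time setting on the reading side would not be expected to change how existing names are decoded.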
No data is available for encoding 866. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.
This is because .NET Core / .NET 5+ only includes a very limited set of supported encodings by default. To add support for all the encodings present in .NET Framework, call this:
using System.Text;
// ...
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
In fact, if that is called, FastZip should automatically pick that encoding (as your OS is set to it).
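To see the effect in isolation, here is a minimal sketch that uses only System.Text and no SharpZipLib calls. It assumes .NET 5+ and assumes the archiver stored the name in code page 866 without marking it as UTF-8, which is a guess about the archive from the report, not something confirmed in this thread.

```csharp
using System;
using System.Text;

// Make the legacy code pages (866, 1251, ...) available on .NET Core / .NET 5+.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

Encoding cp866 = Encoding.GetEncoding(866);

// Simulate an archiver that stored the name as code page 866 bytes (assumption).
byte[] storedName = cp866.GetBytes("слово.txt");

string decodedAsUtf8 = Encoding.UTF8.GetString(storedName); // mismatched decoder
string decodedAsCp866 = cp866.GetString(storedName);        // matching decoder

// The mismatched decode contains U+FFFD replacement characters.
Console.WriteLine(decodedAsUtf8.Contains('\uFFFD'));
// The matching decode restores the original name.
Console.WriteLine(decodedAsCp866 == "слово.txt");
```

Without the RegisterProvider call, `Encoding.GetEncoding(866)` is what raises the "No data is available for encoding 866" error on .NET Core / .NET 5+.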
Actually, I am going to reopen this, because the automatic encoding only works if
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
is called before any instance of ZipCodec has been accessed, and only on .NET Framework. On .NET Core / .NET 5+ it still only returns UTF-8. This should be fixed in an upcoming release.
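The ordering problem can be illustrated without the library. The sketch below is not SharpZipLib's actual code: `Resolve` is a hypothetical stand-in for any component that looks up its legacy encoding once and keeps the result. It assumes .NET Core / .NET 5+, where code page 866 is unavailable until the provider is registered.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for a library that resolves its legacy encoding once.
Encoding Resolve(int codePage)
{
    try
    {
        return Encoding.GetEncoding(codePage);
    }
    catch (Exception e) when (e is NotSupportedException || e is ArgumentException)
    {
        return Encoding.UTF8; // fall back when the code page is unknown
    }
}

// Resolved too early: the provider is not registered yet, so the fallback is kept.
Encoding cachedBefore = Resolve(866);

Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Resolved after registration: the real code page is found.
Encoding resolvedAfter = Resolve(866);

Console.WriteLine($"before: {cachedBefore.CodePage}, after: {resolvedAfter.CodePage}");
```

If a value like `cachedBefore` is stored in a static field, a later registration cannot change it, which is why the call needs to come first at application startup.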