chardetsharp
Encoding not detected
The encoding of the following file is not detected: https://onedrive.live.com/redir?resid=DBC75114109D3C7E!416&authkey=!AAaNrJIJijrzPCo&ithint=file%2czip
Visual Studio, however, does detect it.
Here is my code snippet:
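using System;
using System.IO;
using System.Text;

public static Encoding GetEncodingCharDet(string filename)
{
    Encoding enc = null;
    byte[] buf = new byte[4096];
    UniversalDetector detector = new UniversalDetector();
    using (FileStream fs = File.OpenRead(filename))
    {
        int nread;
        while ((nread = fs.Read(buf, 0, buf.Length)) > 0)
        {
            // Pass only the bytes actually read; a short final chunk
            // would otherwise carry stale data from the previous read.
            byte[] chunk = buf;
            if (nread < buf.Length)
            {
                chunk = new byte[nread];
                Array.Copy(buf, chunk, nread);
            }
            detector.HandleData(chunk);
        }
        detector.DataEnd();
        string encoding = detector.DetectedCharsetName;
        if (encoding != null)
        {
            enc = Encoding.GetEncoding(encoding);
        }
        // Otherwise nothing was detected and enc stays null.
    }
    // Reset the detector so it can be reused.
    detector.Reset();
    return enc;
}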
Is this a bug, or is something wrong in my code?
It's a bug. The detector currently seems to return null for everything. I only recently resurrected this code when Google Code closed down, after it had sat stagnant for several years.
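As a possible stopgap while the detector returns null, here is a minimal sketch that relies only on the byte-order-mark detection built into System.IO.StreamReader. It does not use chardetsharp at all. The class and method names (EncodingFallback, DetectFromBom) are hypothetical, and it only helps if the file actually starts with a BOM, which has not been confirmed for the file in this report; otherwise it simply returns the fallback encoding passed in.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of chardetsharp.
public static class EncodingFallback
{
    // Returns the encoding indicated by the file's byte-order mark,
    // or the supplied fallback when no BOM is present.
    public static Encoding DetectFromBom(string filename, Encoding fallback)
    {
        using (var reader = new StreamReader(filename, fallback, true))
        {
            // BOM detection only happens on the first read.
            reader.Read();
            return reader.CurrentEncoding;
        }
    }
}
```

A caller could try GetEncodingCharDet first and use DetectFromBom(filename, Encoding.Default) only when it returns null.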