NetTopologySuite.IO.ShapeFile
[BUG] UTF-8 read encoding error
Describe the bug
A UTF-8 shapefile is loaded with the wrong encoding.
Code Snippet
using (var reader = new ShapefileDataReader("shape.shp", new GeometryFactory(), Encoding.UTF8))
{
    reader.Read();
}
Investigation
The NetTopologySuite.IO.DbaseFileReader.CreateStreamProviderRegistry method gets called in the ShapefileDataReader constructor.
It passes encoding.EncodingName, whose value is "Unicode (UTF-8)", as the ByteStreamProvider argument.
Later, the header is parsed in NetTopologySuite.IO.DbaseFileHeader.GetEncoding():
try
{
    // The following line throws an exception:
    // 'Unicode (UTF-8)' is not a supported encoding name. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.
    return DbaseEncodingUtility.GetEncodingForCodePageName(cpgText);
}
catch
{
    return DefaultEncoding;
}
I think DbaseEncodingUtility.GetEncodingForCodePageName should be called with "UTF-8" as its argument, not with the display name "Unicode (UTF-8)".
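The mismatch between the two names can be reproduced without NetTopologySuite. The following is a minimal sketch that uses only System.Text; the class EncodingNameDemo and the helper CanResolve are hypothetical names introduced for illustration, and the sketch assumes the library ultimately resolves the name through something like Encoding.GetEncoding, which the exception message above suggests but which I have not verified in the library source.

```csharp
using System;
using System.Text;

public static class EncodingNameDemo
{
    // Hypothetical helper: reports whether Encoding.GetEncoding accepts the name.
    public static bool CanResolve(string name)
    {
        try
        {
            Encoding.GetEncoding(name);
            return true;
        }
        catch (ArgumentException)
        {
            // Thrown when the name is not a supported encoding name.
            return false;
        }
    }

    public static void Main()
    {
        Encoding utf8 = Encoding.UTF8;

        // EncodingName is a human-readable description, not a lookup key.
        Console.WriteLine($"EncodingName: {utf8.EncodingName}, resolvable: {CanResolve(utf8.EncodingName)}");

        // WebName is the registered name and can be passed back to GetEncoding.
        Console.WriteLine($"WebName: {utf8.WebName}, resolvable: {CanResolve(utf8.WebName)}");
    }
}
```

If this assumption holds, passing encoding.WebName instead of encoding.EncodingName would give the header parser a name it can resolve.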
Registering encoding provider
Calling Encoding.RegisterProvider(CodePagesEncodingProvider.Instance); does not fix my issue.
Quick fix
Setting the default header encoding:
DbaseFileHeader.DefaultEncoding = Encoding.UTF8;