GeoDataFrames.jl
DataFrame -> ESRI Shapefile: UTF-8/16 mangled to ?????????. DataFrame -> CSV: UTF-8/16 mangled to Latin-1 characters
Hello!
Firstly, thank you for this package!
At work, we do a lot with Esri products and deal with shapefiles all the time. I initially read a CSV file and a shapefile into two separate DataFrames and combined them with vcat. The CSV file comes from calculating the centroids for a polygon on a layer we have on ArcGIS; the shapefile contains points from another source:
import GeoDataFrames as GDF
centroid_df = GDF.read("/home/my-user/centroids.csv")
point_df = GDF.read("/home/my-user/points.shp")
combined_df = vcat(centroid_df, point_df, cols=:union)
GDF.write("/home/my-user/combined_points.shp", combined_df)
This does create a valid shapefile, but in any column whose values contain Mandarin or Cyrillic script, those values show up as ????????? or "?????????" whenever I load the new combined shapefile with GeoDataFrames or in ArcGIS.
Something similar happens when writing to a CSV file: even with options=Dict("bom"=>"true"), Mandarin and Cyrillic characters are mangled into what look like Latin-1 characters:
# same dataframes as above
GDF.write("/home/my-user/combined_points.csv", combined_df, options=Dict("bom"=>"true"))
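For what it's worth, the CSV output looks to me like UTF-8 bytes being decoded as Latin-1, though that is only my guess. Here is a minimal sketch in plain Julia (no GeoDataFrames involved; the sample string is made up, not taken from my data) that produces that kind of mangling and reverses it:

```julia
# Hypothetical sample value, used only for illustration.
original = "Москва 北京"

# Treat every UTF-8 byte as its own Latin-1 code point (U+0000..U+00FF).
mangled = join(Char.(codeunits(original)))

# Reverse it: map each character back to a single byte and decode as UTF-8.
restored = String(UInt8.(collect(mangled)))

@assert restored == original
@assert length(mangled) == ncodeunits(original)
println(mangled)
```

If the values in the written CSV can be restored the same way, the bytes on disk are probably fine and only the decoding step is wrong.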
Is there an option I can pass to the shapefile driver, am I missing something that applies to both drivers, or is there something else I can do?