
DBF or zip files greater than 4GB

Open zbychuk opened this issue 5 years ago • 7 comments

I have a DBF file which is 5GB. I guess because its size exceeds a 32-bit limit, it does not load in any browser. When I zip it (the zip is only 230MB), loading fails because the archive is a 64-bit (Zip64) zip. Any chance of supporting such huge files?

zbychuk avatar Sep 27 '19 15:09 zbychuk

See #358

jsfenfen avatar Sep 27 '19 17:09 jsfenfen

I've been improving support for reading very large files in the command line tool. Web UI support for files >2GB is less feasible.

Currently, you can load very large GeoJSON, .shp and CSV files into the CLI, but .dbf file import is limited to 2GB on most systems.

I will try to increase the DBF import limit in the near future (in the CLI).

Outputting files >2GB will take more work. (Mapshaper tends to create smaller DBFs than other tools, so the output size limit is unlikely to be a problem. This is because other tools tend to allocate the maximum 254 bytes for each string field, whereas Mapshaper only allocates as many bytes as needed to contain the data.)
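
To give a rough sense of how much field width matters, here is a minimal Python sketch that estimates DBF file size from the record count and field widths. It assumes the common dBASE III layout (a 32-byte header, a 32-byte descriptor per field, a 1-byte header terminator, a 1-byte deletion flag per record, and a 1-byte end-of-file marker). The function name, the record count, and the "fitted" widths are made up for illustration and are not taken from Mapshaper.

```python
HEADER_BYTES = 32            # fixed part of the DBF header
FIELD_DESCRIPTOR_BYTES = 32  # one descriptor per field
MAX_STRING_WIDTH = 254       # maximum width of a DBF string field


def estimate_dbf_size(num_records, field_widths):
    """Estimate the size in bytes of a DBF file (dBASE III layout assumed)."""
    # Header: fixed part + one descriptor per field + 1 terminator byte
    header = HEADER_BYTES + FIELD_DESCRIPTOR_BYTES * len(field_widths) + 1
    # Each record: 1 deletion-flag byte + the fixed width of every field
    record = 1 + sum(field_widths)
    # Body + 1 end-of-file marker byte
    return header + num_records * record + 1


# Hypothetical dataset: 10 million records, three string fields.
records = 10_000_000
padded = estimate_dbf_size(records, [MAX_STRING_WIDTH] * 3)  # every field at 254 bytes
fitted = estimate_dbf_size(records, [12, 30, 8])             # widths sized to the data

print(f"padded: {padded:,} bytes")
print(f"fitted: {fitted:,} bytes")
```

With these made-up numbers, the padded layout comes to about 7.6GB while the fitted layout stays near 0.5GB, which is why sizing fields to the data can keep output well under a 2GB limit.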

mbloch avatar Sep 27 '19 19:09 mbloch

@mbloch It would be nice if we had a command line variable to override string length, no? I'm inserting strings into a TextField and not a CharField.

nitrag avatar Dec 13 '22 18:12 nitrag

@nitrag I'm not familiar with the TextField and CharField data types. Is this Django-specific terminology? If you want to store strings that are longer than 254 bytes, I think that you'll need to use a different file type than Shapefile/DBF.

mbloch avatar Dec 14 '22 03:12 mbloch

@mbloch Ah yes, Postgres native types are VARCHAR and TEXT I think.

I'm reading a Shapefile into mapshaper that has fields with text longer than 254 characters, so I'm not sure how that's possible. The GeoJSON files that I'm inputting have this as well. Mapshaper truncates the text when saving to a Shapefile. From reading the mapshaper documentation, is the 255 limit there for a specific purpose, namely loading the resulting Shapefile into a specialized GIS database? If so, the restriction would be isolated to that use case and not part of the Shapefile spec itself, if I am understanding correctly.

nitrag avatar Dec 14 '22 14:12 nitrag

The original Shapefile specification limits text fields to 254 bytes (see http://switchfromshapefile.org/), but some GIS software has extended the Shapefile format in non-standard ways. Do you know how your Shapefile was created? Can you send me an example file containing longer text fields? I'll consider adding support for extensions to the Shapefile format for compatibility with other software.

mbloch avatar Dec 14 '22 14:12 mbloch

@mbloch Following up rather late on this, but you are correct. It is limited to 254, and making it larger corrupts the file. My issue was converting a GeoJSON file to a Shapefile and losing the extra text. I have since moved to another method. Thanks!

nitrag avatar Feb 25 '23 20:02 nitrag
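
For anyone hitting the same truncation when converting GeoJSON to Shapefile, a minimal Python sketch like the one below can flag the affected values before export. It assumes the attribute text will be written as UTF-8 and measures length in bytes against the 254-byte limit discussed above; the function name and the sample data are hypothetical, and the actual byte count depends on the encoding used for the DBF.

```python
DBF_MAX_STRING_BYTES = 254  # string field limit from the original Shapefile/DBF format


def find_overlong_strings(geojson, limit=DBF_MAX_STRING_BYTES):
    """Return (feature_index, property_name, byte_length) for every string
    property whose UTF-8 encoding is longer than `limit` bytes."""
    hits = []
    for index, feature in enumerate(geojson.get("features", [])):
        # "properties" may be null in valid GeoJSON, so fall back to an empty dict
        properties = feature.get("properties") or {}
        for name, value in properties.items():
            if isinstance(value, str):
                byte_length = len(value.encode("utf-8"))
                if byte_length > limit:
                    hits.append((index, name, byte_length))
    return hits


# Made-up FeatureCollection: two values exceed the limit, one does not.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"name": "ok", "notes": "a" * 300}},
        {"type": "Feature", "geometry": None,
         "properties": {"name": "also ok", "notes": "é" * 200}},  # 200 chars, 400 bytes
        {"type": "Feature", "geometry": None, "properties": None},
    ],
}

for index, name, byte_length in find_overlong_strings(sample):
    print(f"feature {index}: '{name}' is {byte_length} bytes")
```

Values flagged this way would need to be shortened, split across fields, or kept in a format without the 254-byte limit.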