Compact GeoJSON Notation
Proposal for a More Compact GeoJSON Encoding
Background
In many geospatial datasets, the structure of properties is highly regular, meaning that most features share the same set of properties. This proposal aims to introduce a more compact encoding for GeoJSON that optimizes data transfer and storage while maintaining compatibility with existing tools and workflows.
Current Approach Using GeoJSON (Within Specification)
Currently, GeoJSON does not provide a standardized method for property key deduplication. However, a common approach to achieving more compact representations while staying within the existing specification is as follows:
```json
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [139.6917, 35.6895] },
      "properties": { "values": ["Tokyo", 37400068] }
    },
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [-74.006, 40.7128] },
      "properties": { "values": ["New York", 8419600] }
    }
  ],
  "propertyKeys": ["city", "population"]
}
```
This method introduces a `propertyKeys` array at the FeatureCollection level to define the property names, while each feature stores only the corresponding `values` array. While this approach reduces redundancy, it has several limitations:
- Extra nesting: The `values` array introduces an unnecessary hierarchical level under `properties`.
- Non-standard propertyKeys: The `propertyKeys` field is not officially defined in the GeoJSON specification, making its usage unclear in standard-compliant software.
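As a rough illustration of the conversion described above, the following Python sketch turns a standard FeatureCollection into the `values` form. The function name `to_values_form` is hypothetical, and filling missing keys with `null` is an assumption made here to keep positions aligned; the text above does not say how irregular features should be handled.

```python
import json

def to_values_form(fc):
    """Hypothetical helper: hoist property keys to a collection-level list."""
    # Collect property keys in first-seen order across all features.
    keys = []
    for feat in fc["features"]:
        for k in (feat.get("properties") or {}):
            if k not in keys:
                keys.append(k)

    features = []
    for feat in fc["features"]:
        props = feat.get("properties") or {}
        new_feat = dict(feat)  # shallow copy; the input is left untouched
        # Assumption: a missing key becomes None (JSON null) so that
        # every values array lines up with propertyKeys.
        new_feat["properties"] = {"values": [props.get(k) for k in keys]}
        features.append(new_feat)

    out = dict(fc)
    out["features"] = features
    out["propertyKeys"] = keys
    return out

standard = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [139.6917, 35.6895]},
            "properties": {"city": "Tokyo", "population": 37400068},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-74.006, 40.7128]},
            "properties": {"city": "New York", "population": 8419600},
        },
    ],
}

values_form = to_values_form(standard)
print(json.dumps(values_form, indent=2))
```

Because `propertyKeys` is an extra top-level member, the output remains a FeatureCollection whose `properties` values are still JSON objects.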
Proposed More Compact GeoJSON Encoding
To further optimize property storage, we propose a direct mapping of properties as an array at the `properties` level, eliminating the need for the `values` wrapper:
```json
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [139.6917, 35.6895] },
      "properties": ["Tokyo", 37400068]
    },
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [-74.006, 40.7128] },
      "properties": ["New York", 8419600]
    }
  ],
  "propertyKeys": ["city", "population"]
}
```
Advantages of This Proposal
- Eliminates Unnecessary Nesting: The `values` wrapper is removed, reducing structural overhead.
- Preserves Semantic Meaning: The `propertyKeys` array still defines the order of properties, ensuring that property interpretation remains clear.
- More Efficient Encoding: By flattening the structure, we reduce unnecessary object syntax, leading to reduced file size.
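To make the size argument concrete, the sketch below serializes the same features in the three encodings discussed here and compares minified byte counts. The data is synthetic (the two example cities repeated 500 times), and the helper names `size` and `feature` are illustrative only, so the numbers say nothing about real datasets.

```python
import json

def size(obj):
    # Minified separators, so whitespace does not distort the comparison.
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))

def feature(props, coords):
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": coords},
        "properties": props,
    }

keys = ["city", "population"]
# Synthetic data: the two sample rows repeated to mimic a regular dataset.
rows = [
    ("Tokyo", 37400068, [139.6917, 35.6895]),
    ("New York", 8419600, [-74.006, 40.7128]),
] * 500

standard = {
    "type": "FeatureCollection",
    "features": [feature({"city": c, "population": p}, xy) for c, p, xy in rows],
}
values_form = {
    "type": "FeatureCollection",
    "features": [feature({"values": [c, p]}, xy) for c, p, xy in rows],
    "propertyKeys": keys,
}
proposed = {
    "type": "FeatureCollection",
    "features": [feature([c, p], xy) for c, p, xy in rows],
    "propertyKeys": keys,
}

sizes = {
    "standard": size(standard),
    "values": size(values_form),
    "proposed": size(proposed),
}
for name, n in sizes.items():
    print(f"{name:>8}: {n} bytes")
```

In minified JSON the `{"values":` prefix and closing `}` cost 11 bytes per feature, which is exactly what the proposed form saves over the `values` form. With only a handful of features, the fixed cost of the `propertyKeys` array can outweigh the per-feature savings, so the benefit depends on dataset size.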
Compression Efficiency Demonstration
Using actual data from the Japanese Land Price Public Announcement dataset (L01-24), we observed the following size reductions:
- Original GeoJSON: 151,799 KB
- Current Method (`values` under `properties`): ~39,716 KB (approx. 73% reduction)
- Proposed Method (`properties` as array): Further reduced with additional optimizations
- Compressed (ZIP) Comparison:
  - Original GeoJSON (ZIP): 7,371 KB
  - Proposed Method (ZIP): 3,904 KB (approx. 47% reduction)
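The reduction percentages can be re-derived from the file sizes listed above. This short Python check uses only those figures; the function name `reduction_percent` is illustrative.

```python
def reduction_percent(before_kb, after_kb):
    # Share of the original size that was removed, as a percentage.
    return (1 - after_kb / before_kb) * 100

raw = reduction_percent(151_799, 39_716)    # original vs. values form
zipped = reduction_percent(7_371, 3_904)    # original ZIP vs. proposed ZIP

print(f"uncompressed: {raw:.1f}% smaller")
print(f"zipped:       {zipped:.1f}% smaller")
```

The results, about 73.8% and 47.0%, are consistent with the approximate figures quoted in the list.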
Implementation and Compatibility
A conversion tool has been developed to facilitate encoding in this proposed format:
- Conversion Tool: [GeoJSON Compact Converter](https://github.com/satakagi/compact-geojson-and-csv-converter/blob/main/index.html)
- Verification with Actual Data: [Test Results](https://github.com/gislab-mlit/next-ksj-formats/issues/26)
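Existing GeoJSON consumers that expect `properties` to be a key-value object, which is what RFC 7946 specifies (a JSON object or null), will need adaptation before they can read the proposed array form. However, software designed to handle structured datasets can easily incorporate this encoding.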
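One possible adaptation is a thin decoding step that expands the compact form back into standard GeoJSON before handing it to existing tools. The sketch below is a minimal version under stated assumptions: `expand_compact` is a hypothetical name, and rejecting features whose array length differs from `propertyKeys` is a choice made here, not something the proposal defines.

```python
def expand_compact(fc):
    """Hypothetical decoder: rebuild key-value properties from propertyKeys."""
    keys = fc["propertyKeys"]
    features = []
    for i, feat in enumerate(fc["features"]):
        values = feat["properties"]
        # Assumption: every feature must supply one value per key.
        if len(values) != len(keys):
            raise ValueError(
                f"feature {i}: expected {len(keys)} values, got {len(values)}"
            )
        new_feat = dict(feat)  # shallow copy; the input is left untouched
        new_feat["properties"] = dict(zip(keys, values))
        features.append(new_feat)

    # Drop the non-standard member so the result is plain GeoJSON.
    out = {k: v for k, v in fc.items() if k != "propertyKeys"}
    out["features"] = features
    return out

compact = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [139.6917, 35.6895]},
            "properties": ["Tokyo", 37400068],
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-74.006, 40.7128]},
            "properties": ["New York", 8419600],
        },
    ],
    "propertyKeys": ["city", "population"],
}

restored = expand_compact(compact)
print(restored["features"][0]["properties"])
```

Running the decoder at the boundary keeps downstream code unchanged, at the cost of giving up the size savings once the data is in memory.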
Conclusion
This proposal presents a structured yet efficient way to encode GeoJSON properties for datasets where the property schema is uniform across features. Given the prevalence of such structured datasets, this approach can offer significant efficiency gains without compromising clarity or compatibility.
Feedback and discussion are welcome to refine this approach further.