json-machine
GeoJSON properties only
Hi!
I'm working with large GeoJSON files. Here is an example entry:
{
"type":"FeatureCollection",
"features":[
{
"type":"Feature",
"id":"500410000A0055",
"geometry":{
"type":"Polygon",
"coordinates":[
[
[
-1.8556355,
49.6650877
],
[
-1.8556206,
49.6650552
],
[
-1.8555928,
49.6650017
],
[
-1.8552808,
49.6650365
],
[
-1.85531,
49.6651397
],
[
-1.8555869,
49.6650975
],
[
-1.8556355,
49.6650877
]
]
]
},
"properties":{
"id":"500410000A0055",
"commune":"50041",
"prefixe":"000",
"section":"A",
"numero":"55",
"contenance":248,
"arpente":false,
"created":"1902-09-20",
"updated":"2021-03-09"
}
}
]
}
I only need some of the properties, such as contenance, section and numero. For now, I'm using this:
use JsonMachine\Items;

$datas = Items::fromFile('datas/50041.json', ['pointer' => '/features']);
foreach ($datas as $key => $data) {
    if ($data->properties->contenance == 248) {
        echo $data->properties->commune;
        echo $data->properties->section;
        echo $data->properties->numero;
        //Etc
    }
}
But I don't need geometry, which takes a lot of time to load.
So I tried to use this more precise pointer:
$datas = Items::fromFile('datas/50041.json', ['pointer' => '/features/-/properties']);
foreach ($datas as $key => $data) {
    var_dump($data);
}
But JSON Machine doesn't return an object...
So, is it possible to point at the GeoJSON properties and get each one as an object with JSON Machine, without loading geometry?
Thanks!
Hi, thanks for the question, and sorry for the delay. I had a lot going on in life 😄. I'll try to get to it asap. In the meantime, you can try the dev-recursive version from #36 and see if it works for you. Let me know.
Hi!
I tried this:
$datas = Items::fromFile('datas/50041.json', ['recursive' => true]);
foreach ($datas as $data) {
    foreach ($data as $d) {
        foreach ($d as $dKey => $dValue) {
            if ($dKey === 'properties') {
                if ($dValue->contenance == 248) {
                    echo $dValue->commune;
                    echo $dValue->section;
                    echo $dValue->numero;
                }
            }
        }
    }
}
... but it doesn't seem more efficient.
Is this the right way to use recursive?
What exactly do you mean by efficient? Quicker or less memory usage?
The execution time is the same with both methods.
Yes. The recursive method is there to lower memory consumption in bigger subtrees. The speed is more or less the same.
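To see which of the two (time or memory) actually differs on your file, a minimal measuring sketch could look like this. The measure() helper is hypothetical and not part of JSON Machine, and the workload inside it is only a placeholder for your own Items::fromFile(...) loop.

```php
<?php
// Hypothetical helper (not part of JSON Machine): runs $work and reports
// wall-clock time plus the peak memory of the script so far.
function measure(callable $work): array
{
    $start = microtime(true);
    $work();

    return [
        'seconds' => microtime(true) - $start,
        // The peak is process-wide, so compare the variants in separate runs.
        'peak_bytes' => memory_get_peak_usage(true),
    ];
}

$stats = measure(function (): void {
    // Placeholder workload; put the Items::fromFile(...) loop here instead.
    $sum = 0;
    foreach (range(1, 100000) as $n) {
        $sum += $n;
    }
});

printf("%.4f s, %d bytes peak\n", $stats['seconds'], $stats['peak_bytes']);
```

Running each variant as its own script keeps the peak memory figures from contaminating each other.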
But JSON Machine doesn't return an object...
What does it return?
But I don't need geometry, which takes a lot of time to load.
Btw you can't avoid parsing geometry internally, as the parser has to read everything sequentially to know where it is. JSON is by nature a format where you have to go through all the data to find what you need. There are no places in the data format known beforehand that mark where a parser can safely jump.
What does it return?
$datas = Items::fromFile('datas/50041.json', ['pointer' => '/features/-/properties']);
foreach ($datas as $key => $data) {
    var_dump($data);
}
Returns:
string(14) "500410000A0084"
string(5) "50041"
string(3) "000"
string(1) "A"
string(2) "84"
int(51040)
bool(false)
string(10) "1902-09-20"
string(10) "2021-03-09"
string(14) "500410000A0086"
string(5) "50041"
string(3) "000"
string(1) "A"
string(2) "86"
int(1885)
bool(false)
string(10) "1902-09-20"
string(10) "2021-03-09"
etc.
Can you please elaborate more on what's wrong about this output?
Recursive iteration #36 is now finished, merged, and released.
Can you please elaborate more on what's wrong about this output?
GeoJSON is a standard format for geospatial data, and it would be convenient to directly access the properties of each entry as one object:
$datas = Items::fromFile('datas/50041.json', ['pointer' => '/features/-/properties']);
foreach ($datas as $key => $data) {
    var_dump($data);
}
Oh, I see what you mean now. This is a result of the architecture of this library. The pointer specifies the structure that will be iterated over. If you specify .../properties, you get an iterator over the contents of each properties object. Support for specific JSON file formats is currently out of scope for this lib... If the original problem is that geometry takes a long time to load, there's no way around it. The parser cannot jump; it has to read everything sequentially to find what you're looking for. One thing that might help is to try the /features/- pointer. The solution may involve PassThruDecoder, which skips decoding and may save you some time. You then decode only the properties key. Updating your example (not tested):
use JsonMachine\Items;
use JsonMachine\JsonDecoder\PassThruDecoder;

$feature = Items::fromFile('datas/50041.json', ['pointer' => '/features/-', 'decoder' => new PassThruDecoder()]);
foreach ($feature as $key => $value) {
    if ($key == 'properties') {
        // Only this value is decoded; geometry stays a raw JSON string.
        $properties = json_decode($value);
    }
}
Edit: You may also want to try the recursive iteration
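If the flat stream from the /features/-/properties pointer is otherwise fine, another option is to regroup it into one object per feature yourself. This is only a sketch: groupProperties() is a hypothetical helper, not part of JSON Machine, and it assumes that every properties object starts with the same key (id here). flatStream() is a stand-in for the Items::fromFile(...) iterator so the example runs on its own.

```php
<?php
// Hypothetical helper, not part of JSON Machine: regroups a flat
// key => value stream into one object per feature.
// Assumption: every properties object starts with the same key ('id').
function groupProperties(iterable $pairs, string $firstKey = 'id'): Generator
{
    $current = [];
    foreach ($pairs as $key => $value) {
        // Seeing the first key again means the next feature has started.
        if ($key === $firstKey && $current !== []) {
            yield (object) $current;
            $current = [];
        }
        $current[$key] = $value;
    }
    // Emit the last feature, which has no following 'id' to close it.
    if ($current !== []) {
        yield (object) $current;
    }
}

// Stand-in for Items::fromFile(...) with the '/features/-/properties' pointer.
function flatStream(): Generator
{
    $rows = [
        ['id' => '500410000A0084', 'section' => 'A', 'numero' => '84', 'contenance' => 51040],
        ['id' => '500410000A0086', 'section' => 'A', 'numero' => '86', 'contenance' => 1885],
    ];
    foreach ($rows as $row) {
        foreach ($row as $key => $value) {
            yield $key => $value;
        }
    }
}

foreach (groupProperties(flatStream()) as $properties) {
    if ($properties->contenance == 1885) {
        echo $properties->section, $properties->numero, PHP_EOL;
    }
}
```

The parser still reads geometry either way; this only changes the shape of what the loop receives.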
Updating your example
Nice, this is faster!
Edit: You may also want to try the recursive iteration
I will try, but you said that the speed is more or less the same, right?
Thanks a lot!
I will try, but you said that
the speed is more or less the same, right?
Right, it was just another suggestion in case unacceptable memory consumption from geometry is also a problem.
(Also, after finishing the recursive iteration feature, performance tests show it's actually about 2x slower. See Readme.)