GeoJSON properties only

Open theamnesic opened this issue 1 year ago • 5 comments

Hi!

I'm working with large GeoJSON files. Here is an example entry:

{
   "type":"FeatureCollection",
   "features":[
      {
         "type":"Feature",
         "id":"500410000A0055",
         "geometry":{
            "type":"Polygon",
            "coordinates":[
               [
                  [
                     -1.8556355,
                     49.6650877
                  ],
                  [
                     -1.8556206,
                     49.6650552
                  ],
                  [
                     -1.8555928,
                     49.6650017
                  ],
                  [
                     -1.8552808,
                     49.6650365
                  ],
                  [
                     -1.85531,
                     49.6651397
                  ],
                  [
                     -1.8555869,
                     49.6650975
                  ],
                  [
                     -1.8556355,
                     49.6650877
                  ]
               ]
            ]
         },
         "properties":{
            "id":"500410000A0055",
            "commune":"50041",
            "prefixe":"000",
            "section":"A",
            "numero":"55",
            "contenance":248,
            "arpente":false,
            "created":"1902-09-20",
            "updated":"2021-03-09"
         }
      }
   ]
}

I only need some of the properties, such as contenance, section and numero. For now, I'm using this:


$datas = Items::fromFile('datas/50041.json', ['pointer' => '/features']);

foreach ($datas as $key => $data) {
  if ($data->properties->contenance == 248) {
    echo $data->properties->commune;
    echo $data->properties->section;
    echo $data->properties->numero;
    // Etc.
  }
}

But I don't need the geometry, which takes a lot of time to load. So I tried this more precise pointer:


$datas = Items::fromFile('datas/50041.json', ['pointer' => '/features/-/properties']);

foreach ($datas as $key => $data) {
  var_dump($data);
}

But JSON Machine doesn't return an object...

So, is it possible to point to the GeoJSON properties as an object with JSON Machine, without loading the geometry?

Thanks!

theamnesic avatar May 22 '24 13:05 theamnesic

Hi, thanks for the question, and sorry for the delay. I had a lot going on in life 😄. I'll try to get to it ASAP. In the meantime, you can try the dev-recursive version from #36 and see if it works for you. Let me know.

halaxa avatar Jul 11 '24 10:07 halaxa

Hi!

I tried this:


$datas = Items::fromFile('datas/50041.json', ['recursive' => true]);

foreach ($datas as $data) {

  foreach ($data as $d) {

    foreach ($d as $dKey => $dValue) {

      if ($dKey === 'properties') {

        if ($dValue->contenance == 248) {

          echo $dValue->commune;
          echo $dValue->section;
          echo $dValue->numero;
        }

      }

    }

  }

}

... but it doesn't seem more efficient.

Is this the right way to use recursive?

theamnesic avatar Sep 05 '24 13:09 theamnesic

What exactly do you mean by efficient? Quicker or less memory usage?

halaxa avatar Sep 05 '24 14:09 halaxa

The execution time is the same with both methods.

theamnesic avatar Sep 05 '24 15:09 theamnesic

Yes. The recursive method is there to lower memory consumption in bigger subtrees. The speed is more or less the same.

halaxa avatar Sep 05 '24 16:09 halaxa

But JSON Machine doesn't return an object...

What does it return?

halaxa avatar Nov 03 '24 15:11 halaxa

But I don't need the geometry, which takes a lot of time to load.

Btw, you can't avoid parsing geometry internally, as the parser has to read everything sequentially to know where it is. JSON is by nature a format where you have to go through all the data to find what you need. There are no places in the data format known beforehand that mark where a parser can safely jump.
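
To illustrate the point about sequential reading, here is a minimal sketch of what "skipping" a nested value involves. The helper skipContainer() is hypothetical and is not how JSON Machine is implemented; it only shows that finding the end of geometry means visiting every byte of it, because the closing brace can only be found by tracking nesting depth and string boundaries.

```php
<?php
/**
 * Return the offset just past the JSON array or object that starts at $start.
 * Hypothetical helper for illustration only; not part of JSON Machine.
 */
function skipContainer(string $json, int $start): int
{
    $depth = 0;
    $inString = false;
    $length = strlen($json);
    for ($i = $start; $i < $length; $i++) {
        $char = $json[$i];
        if ($inString) {
            if ($char === '\\') {
                $i++; // skip the character that follows a backslash
            } elseif ($char === '"') {
                $inString = false;
            }
            continue;
        }
        if ($char === '"') {
            $inString = true;
        } elseif ($char === '[' || $char === '{') {
            $depth++;
        } elseif ($char === ']' || $char === '}') {
            $depth--;
            if ($depth === 0) {
                return $i + 1; // matching close bracket found
            }
        }
    }
    throw new RuntimeException('Unterminated JSON value');
}

// A shortened feature, based on the example entry at the top of this issue.
$json = '{"geometry":{"type":"Polygon","coordinates":[[[-1.8556355,49.6650877],[-1.8556206,49.6650552]]]},"properties":{"numero":"55"}}';

$geometryStart = strpos($json, '{', 1);            // opening brace of the geometry object
$geometryEnd = skipContainer($json, $geometryStart); // every byte of geometry is scanned here

// Everything after the comma that follows geometry.
echo substr($json, $geometryEnd + 1), PHP_EOL;
```

Even this simplified scan cannot jump ahead: the cost of getting past geometry grows with its size, which matches the timings reported in this thread.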

halaxa avatar Nov 03 '24 15:11 halaxa

What does it return?

$datas = Items::fromFile('datas/50041.json', ['pointer' => '/features/-/properties']);

foreach ($datas as $key => $data) {
  var_dump($data);
}

Returns:

string(14) "500410000A0084"
string(5) "50041"
string(3) "000"
string(1) "A"
string(2) "84"
int(51040)
bool(false)
string(10) "1902-09-20"
string(10) "2021-03-09"
string(14) "500410000A0086"
string(5) "50041"
string(3) "000"
string(1) "A"
string(2) "86"
int(1885)
bool(false)
string(10) "1902-09-20"
string(10) "2021-03-09"
etc.

theamnesic avatar Nov 19 '24 18:11 theamnesic

Can you please elaborate more on what's wrong about this output?

halaxa avatar Nov 22 '24 17:11 halaxa

Recursive iteration #36 is now finished, merged, and released.

halaxa avatar Nov 24 '24 17:11 halaxa

Can you please elaborate more on what's wrong about this output?

GeoJSON is a standard format for geospatial data, and it would be convenient to directly access the properties of each entry:

$datas = Items::fromFile('datas/50041.json', ['pointer' => '/features/-/properties']);

foreach ($datas as $key => $data) {
  var_dump($data);
}

theamnesic avatar Nov 26 '24 10:11 theamnesic

Oh, I see what you mean now. This is a result of the architecture of this library. The pointer specifies a structure that will be iterated over. If you specify .../properties, you get an iterator over the insides of the properties. Support for different JSON file formats is currently out of the scope of this lib...

If the original problem is that geometry takes a long time to load, there's no way around it. The parser cannot jump; it has to read everything sequentially to find what you're looking for.

One thing that might help is to try the /features/- pointer. The solution may involve PassThruDecoder, which skips decoding and may save you some time. You then decode only the properties key. Updating your example (not tested):

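use JsonMachine\Items;
use JsonMachine\JsonDecoder\PassThruDecoder;

// PassThruDecoder yields each value as a raw JSON string instead of decoding it.
$feature = Items::fromFile('datas/50041.json', ['pointer' => '/features/-', 'decoder' => new PassThruDecoder]);

foreach ($feature as $key => $value) {
  if ($key === 'properties') {
    // Decode only the properties; the geometry string is never decoded.
    $properties = json_decode($value);
  }
}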

Edit: You may also want to try the recursive iteration
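
For the /features/-/properties pointer discussed above, the flat stream could also be regrouped into one record per feature in user code. The following is only a sketch under stated assumptions: groupProperties() and sampleStream() are hypothetical names, not part of JSON Machine; it assumes the iteration key carries the property name; and it assumes every properties object has the same keys, so that a repeated key marks the start of the next feature. A generator stands in for the Items iterator so the example runs on its own.

```php
<?php
/**
 * Regroup a flat key => value stream into one array per feature.
 * Hypothetical helper, not part of JSON Machine. Assumes all properties
 * objects share the same keys, so a repeated key starts a new record.
 */
function groupProperties(iterable $flat): Generator
{
    $record = [];
    foreach ($flat as $key => $value) {
        if (array_key_exists($key, $record)) {
            yield $record; // key seen again: the previous record is complete
            $record = [];
        }
        $record[$key] = $value;
    }
    if ($record !== []) {
        yield $record; // flush the last record
    }
}

// Stand-in for Items::fromFile(..., ['pointer' => '/features/-/properties']),
// using a shortened version of the values from the output shown earlier.
function sampleStream(): Generator
{
    yield 'id' => '500410000A0084';
    yield 'section' => 'A';
    yield 'numero' => '84';
    yield 'contenance' => 51040;
    yield 'id' => '500410000A0086';
    yield 'section' => 'A';
    yield 'numero' => '86';
    yield 'contenance' => 1885;
}

foreach (groupProperties(sampleStream()) as $properties) {
    echo $properties['section'], ' ', $properties['numero'], ' ', $properties['contenance'], PHP_EOL;
}
```

This does not make parsing any faster, since the geometry is still read; it only restores the per-feature grouping. If the key sets differ between features, the repeated-key heuristic breaks, and the /features/- pointer is the safer choice.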

halaxa avatar Nov 26 '24 12:11 halaxa

Updating your example

Nice, this is faster!

Edit: You may also want to try the recursive iteration

I will try, but you said that the speed is more or less the same, right?

Thanks a lot!

theamnesic avatar Nov 27 '24 11:11 theamnesic

I will try, but you said that the speed is more or less the same, right?

Right, it was just another suggestion in case the problem is also unacceptable memory consumption caused by the geometry.

(Also, after finishing the recursive iteration feature, performance tests show it's actually about 2x slower. See Readme.)

halaxa avatar Nov 27 '24 13:11 halaxa