geos icon indicating copy to clipboard operation
geos copied to clipboard

Don't initialize CoordinateSequence in constructor

Open eyal0 opened this issue 3 years ago • 2 comments

tl;dr

No need to initialize the CoordinateSequence when we construct a new one because we never used to do that.

The new code (d463bcb6df8fc1f01f276b27719ee8ccfb2d788a)

https://github.com/libgeos/geos/blob/d463bcb6df8fc1f01f276b27719ee8ccfb2d788a/src/geom/CoordinateSequence.cpp#L68-L79

In the code above, we see that if the CoordinateSequence is generated by calling with two arguments, sz and dim, then initialize is always called. It calls fillVector which calls std::fill and loops through the size of the new vector to initialize all elements to 0.

I think that this is not necessary.

The old code (60edf0e5f040e065193bd5beb79f7fa8febb1cd5)

Looking at commit 60edf0e5f040e065193bd5beb79f7fa8febb1cd5 we see that creating the CoordinateSequence directly like this is instead of calling getGeometrySequenceFactory()->create(sz, dim).

https://github.com/libgeos/geos/commit/60edf0e5f040e065193bd5beb79f7fa8febb1cd5#diff-95f3875df536135fe955deecfaeab0af5f6af52008d0fd5a56a3ed1b6638aca1L2427-R2429

getCoordinateSequenceFactory is a method of the GeometryFactory class:

https://github.com/libgeos/geos/blob/0e8d4368b8bd72a7d361286e8523ebce5cff6146/include/geos/geom/GeometryFactory.h#L446-L449

It returns a CoordinateSequenceFactory, which has a create method. It's here in the old code:

https://github.com/libgeos/geos/blob/0e8d4368b8bd72a7d361286e8523ebce5cff6146/include/geos/geom/CoordinateSequenceFactory.h#L87-L98

We see that it is a pure virtual function so it must be implemented by inheritors of this class.

There are only two classes that inherit from this class, the DefaultCoordinateSequenceFactory and the CoordinateArraySequenceFactory:

https://github.com/libgeos/geos/blob/0e8d4368b8bd72a7d361286e8523ebce5cff6146/include/geos/geom/CoordinateArraySequenceFactory.h#L42

https://github.com/libgeos/geos/blob/0e8d4368b8bd72a7d361286e8523ebce5cff6146/include/geos/geom/DefaultCoordinateSequenceFactory.h#L24

Let's look at them one-by-one

CoordinateArraySequenceFactory

Here's the implementation of create(size_t, size_t):

https://github.com/libgeos/geos/blob/0e8d4368b8bd72a7d361286e8523ebce5cff6146/include/geos/geom/CoordinateArraySequenceFactory.h#L69-L74

It's creating a CoordinateArraySequence using the constructor and returning a unique_ptr. Here's that constuctor:

https://github.com/libgeos/geos/blob/0e8d4368b8bd72a7d361286e8523ebce5cff6146/src/geom/CoordinateArraySequence.cpp#L37-L42

It is resizing the vector to make it the right size but not initializing anything.

DefaultCoordinateSequenceFactory

Here's the implementation of create(size_t, size_t):

https://github.com/libgeos/geos/blob/0e8d4368b8bd72a7d361286e8523ebce5cff6146/include/geos/geom/DefaultCoordinateSequenceFactory.h#L43-L53

It's making a FixedSizeCoordinateSequence,templated on the number of dimensions. It's like a fixed size list, like a tuple! Let's look at that constuctor:

https://github.com/libgeos/geos/blob/0e8d4368b8bd72a7d361286e8523ebce5cff6146/include/geos/geom/FixedSizeCoordinateSequence.h#L35

We see that it is just coping the dimension. It is not initializing anything.

Conclusion

For all cases of create(size_t, size_t), the code before 60edf0e5f040e065193bd5beb79f7fa8febb1cd5 did not initialize the coordinates. The new code could also skip the initialization to maintain the same behavior. This would also improve performance because we'd eliminate the slow call to std::fill.

eyal0 avatar Dec 08 '22 04:12 eyal0

the code before https://github.com/libgeos/geos/commit/60edf0e5f040e065193bd5beb79f7fa8febb1cd5 did not initialize the coordinates

We didn't initialize them explicitly, but they were still initialized. If I create a std::vector<Coordinate>(5) or std::array<Coordinate, 5>, the five elements are initialized using the default constructor for Coordinate.

That said, it should be possible to skip the initialization in many places. At one point I added an initialize flag to the the constructor with a dim parameter. But I think it's better to move in the direction of eliminating this constructor entirely, because the dim flag is ambiguous (does a value of 3 indicate XYZ or XYM?)

dbaston avatar Dec 08 '22 13:12 dbaston

Ah, you are right! That explains why I had to initialize the first one in the PR.

If you can eliminate the usage of dim==0 (unknown) then you can do it.

eyal0 avatar Dec 08 '22 14:12 eyal0