gtfs-lib icon indicating copy to clipboard operation
gtfs-lib copied to clipboard

Move to new LocationTech JTS

Open karussell opened this issue 7 years ago • 32 comments

JTS is now under the umbrella of locationtech and has a new release: https://github.com/locationtech/jts/

karussell avatar Feb 06 '18 10:02 karussell

@abyrd, do you see any issues with updating the JTS version?

landonreed avatar Feb 14 '18 15:02 landonreed

No, I wouldn't expect any problems with a minor version bump. But it would need to be tested before we use it in production.

abyrd avatar Feb 14 '18 15:02 abyrd

This project uses geotools, which still has some dependencies on JTS 1.14. It seems like the main use of geotools here is transforming points between projections. Would you consider switching to something like proj4j for projection, which would remove the dependency on geotools, not be LGPL, and not lock the JTS version?

willcohen avatar Feb 21 '18 19:02 willcohen

@karussell and @willcohen can you provide a little more context? Why are you concerned about the JTS version, were there major bugfixes in newer versions or are there conflicts with other dependencies in your projects?

Actually I don't know why gtfs-lib would need to project coordinates at all. If geotools/JTS is only used to transform between coordinate systems, I'd rather just remove these dependencies entirely. Very basic equirectangular projection (lat' = lat; lon' = lon * cos(lat)) is perfectly adequate for all our data validation situations.

As stated in the Javadoc for this class GeoUtils: GeoUtils ported from old GTFS validator. Does everything with JTS MathTransforms and UTM, which is needlessly complicated, so all functions are deprecated.

abyrd avatar Feb 22 '18 07:02 abyrd

I'm surveying the code for uses of Geotools. In GTFSFeed, GeodeticCalculator.orthodromicDistance is used in getInterpolatedStopTimesForTrip(). GeoUtils.getDistance() is used when measuring the length of a trip in meters. There is a ProjectedCoordinate class that doesn't seem to be used at all. There is a ReversedTripValidator that has not yet been reimplemented for the new validation framework, mostly because it contains no documentation explaining what it's supposed to do.

I think that's all the direct and indirect uses. All of them are just distance calculations, for which we have a standard, tolerably accurate method that doesn't involve rigorous projections.

abyrd avatar Feb 22 '18 07:02 abyrd

The move to JTS 1.15 had two major changes -- the first is that JTS got relicensed by LocationTech under EPL/EDL instead of LGPL, which means that all of the reverse engineering licensing issues that come with LGPL are not issues with 1.15 versus previous versions. Addressing this would also mean that using proj4j for projections or spatial4j for distance calcs has value when the full power of the LGPL geotools isn't needed for small spatial tasks. The second major change is that all the com.vividsolutions.jts classes got renamed to org.locationtech.jts, which makes 1.15 objects incompatible with previous JTS objects.

willcohen avatar Feb 22 '18 08:02 willcohen

Thanks for the summary @willcohen.

Our business is based on providing the public sector with software that uses transparent reproducible methods, so we have no reverse engineering concerns. We do like to see widespread reuse of open data though, and know that many businesses are wary of LGPL (for reasons I've never been able to fully understand).

I don't think this library exposes or returns any JTS objects via its intended public interface, so there should not be any issue with object incompatibility, where calling code that imports gtfs-lib as a dependency would be expecting JTS 1.15 objects. So it seems like the only real concern is gtfs-lib pulling in some unnecessary transitive dependencies. Can you confirm that?

Anyway, comments in our own code imply that we didn't really want to pull in JTS to do these distance calculations. We could certainly do it with another library or our usual equirectangular method. That's why I did the above survey of call sites to Geotools/JTS - I think we should eliminate this dependency rather than migrating to a newer version.

abyrd avatar Feb 22 '18 08:02 abyrd

Thanks for the response! To confirm, it's correct that the transitive dependency is the issue. Replacing the GeodeticCalculator.getOrthodromicDistance() and JTS.orthodromicDistance() with Util.fastDistance seems straightforward, which removes the need for geotools. GTFSFeed has a few other functions, though -- getMergedBuffers, getConvexHull -- which do seem like they're going to need something like JTS as long as they keep dealing in geometries. Does it seems right that geotools's dependency could go and be replaced with fastDistance, but JTS would need to stay, at which point it could more easily be version bumped?

willcohen avatar Feb 22 '18 09:02 willcohen

Our business is based on providing the public sector with software that uses transparent reproducible methods, so we have no reverse engineering concerns.

It is not just that. There are concerns&complications for LGPL on Android and although we do not use gtfs-lib on Android yet, we use JTS there and when all libraries use the latest JTS this would help reducing complexity and footprint.

I think we should eliminate this dependency rather than migrating to a newer version.

Sounds even better :)

karussell avatar Feb 22 '18 12:02 karussell

JTS 1.15 will also allow removing the dependency on the forked conveyal/jackson2-geojson, which would otherwise also need to get updated to support JTS 1.15. JTS can now read geojsons directly into Geometries and write geojsons from Geometries using GeoJsonReader and GeoJsonWriter.

willcohen avatar Feb 22 '18 16:02 willcohen

Thanks for this hint! As we currently have forked another repo and so we probably do not need this as well ...

karussell avatar Feb 22 '18 17:02 karussell

The GeoJson read/write functionality is quite convincing. I think we'll want to shift JTS versions it's just a matter of coordinating that change with our other projects that import gtfs-lib as a dependency.

abyrd avatar Mar 01 '18 16:03 abyrd

it's just a matter of coordinating that change with our other projects that import gtfs-lib as a dependency

Why does this matter? Please don't get me wrong. I mean, the other projects will still point to an old version of this lib and the first step is to release a new version here?

karussell avatar Mar 01 '18 16:03 karussell

Okay! Let me know how I can help. It seems like after removing geotools (I believe #103 accomplishes the goals listed here, but let me know if I misunderstood how you'd like to perform those calculations), changing the version should be a relatively straightforward task of changing the object groupids and moving away from conveyal/jackson2-geojson.

willcohen avatar Mar 01 '18 16:03 willcohen

On Mar 2, 2018, at 00:21, Peter [email protected] wrote:

it's just a matter of coordinating that change with our other projects that import gtfs-lib as a dependency

Why does this matter? Please don't get me wrong. I mean, the other projects will still point to an old version of this lib and the first step is to release a version here?

For better or worse, some of our other projects are very tightly coupled to gtfs-lib and its release cycle. It won’t necessarily cause problems but I need to discuss with Landon before we act.

abyrd avatar Mar 01 '18 16:03 abyrd

Just as a hint: the JSON reader&writer has the disadvantage of pulling in some old & inactive dependency:

<groupId>com.googlecode.json-simple</groupId>
<artifactId>json-simple</artifactId>

So core is not sufficient and you need to use jts-io-common.

I prefer jackson here, maybe you could share what changes where necessary for your use cases to the bedatadriven/jackson-datatype-jts, as we use this too and could consider switching? Then we don't have this jackson-datatype-jts library duplicated and also we would not need to duplicate the json reader/writer library.

karussell avatar Mar 05 '18 16:03 karussell

On our side, now that I have thought this through a bit, the problem is that we actually use GeoTools quite heavily in some projects that import gtfs-lib. GeoTools transitively depends on JTS, and seems to still be using the "old" Vividsolutions JTS. So if gtfs-lib is changed to use the new JTS package scheme, a project using GeoTools also imports the old package scheme and the two are using different JTS objects. That doesn't seem good.

Is there any plan to move Geotools over to this new JTS? Is this new JTS somehow "official" or is it a fork?

abyrd avatar Mar 06 '18 07:03 abyrd

Is this new JTS somehow "official" or is it a fork?

Pretty official. I think @dr-jts is the tech director of Vivid Solutions and the core dev of the awesome old&new JTS :)

Is there any plan to move Geotools over to this new JTS?

I don't know but it would be strange if not at some point.

That doesn't seem good

Yes, but as they use different name space this should not be of any harm. Someone must do the first step :)

karussell avatar Mar 06 '18 08:03 karussell

It should not do any harm, unless we need to pass objects from one library to the other in which case I don't think it would work at all. We'd need to think through whether we'll run into any such situations.

abyrd avatar Mar 06 '18 08:03 abyrd

Ah, yes, true. But why would you need be required to upgrade now to a new gtfs-lib upstream? I've created a new issue at geotools: https://osgeo-org.atlassian.net/browse/GEOT-5954

karussell avatar Mar 06 '18 08:03 karussell

For better or worse, gtfs-lib is pretty tightly coupled to our data tools project. We are very actively modifying it to build out the relational database storage functionality. We would have to maintain two separate branches until we are ready to switch to new JTS. I see why you guys want to shift to the latest and greatest JTS, I just have to be wary of the repercussions.

abyrd avatar Mar 06 '18 08:03 abyrd

@karussell Yes, JTS is now officially hosted under LocationTech.

Hopefully GeoTools will switch to the new version (with new package naming). I do not have any ETA for that however.

dr-jts avatar Mar 06 '18 18:03 dr-jts

It appears that the geotools 20 release will include JTS 1.15.

willcohen avatar Jun 29 '18 20:06 willcohen

@abyrd do you have plans to remove geotools i.e. merge #103?

karussell avatar Sep 10 '18 22:09 karussell

And separately from #103, what other tools besides r5 would you also need to migrate to JTS 1.15 alongside gtfs-lib during this transition? Geotools 20-RC is now out and uses the newer JTS, so the Geotools 20 final release should be quite soon and it may be possible to start transitioning the other tools that will still be using Geotools to use JTS 1.15 as well.

willcohen avatar Sep 10 '18 22:09 willcohen

As I understand it, we can easily remove Geotools and depend on JTS directly. I appreciate that you've done the research and supplied a pull request for this.

However, as @willcohen mentions above, methods like getMergedBuffers and getConvexHull return JTS geometry objects. When #103 removes Geotools, it continues using JTS as a direct dependency, and doesn't make any changes to the methods you mentioned. If we update this JTS version, it will be incompatible with all our other software because it uses different package names.

We'd have to do an audit of all our software that uses gtfs-lib, directly or transitively, to see what will break. One is OpenTripPlanner, which imports R5, which imports gtfs-lib. OTP doesn't even need to import R5, so again we will probably just be removing code and dependencies, but I know that making any such changes and testing everything will still take several hours.

To be honest it doesn't seem likely to happen very soon, I don't think we have time to work on this at the moment if the main motivation is a more permissive license.

@landonreed what's your take on this? What would be the repercussions on our datatools project (among other things) of swapping out JTS to a newer version that uses new package names?

abyrd avatar Sep 12 '18 09:09 abyrd

I don't believe there would be any major consequences to bumping JTS in gtfs-lib for our Data Tools project. The use of JTS in Data Tools is relatively limited (much of it is dead code that is no longer being used for the editor -- except for migrating data from one representation to another) and doesn't appear to hand off JTS objects to or from gtfs-lib classes. We would need to test this once the PR branch is ready to go, but I don't immediately see any red flags.

landonreed avatar Sep 12 '18 19:09 landonreed

So, the methods that depend on JTS are extensions to the older MapDB-backed GTFSFeed, and are no longer needed by our new RDBMS-backed editor. Maybe this would be a good opportunity to remove some dead code then. Our datatools project (which is tightly coupled to gtfs-lib) is stable enough now that we should be able to do a new major-version breaking release of gtfs-lib after merging #103.

So @willcohen I talked to @landonreed and we think we could just eliminate both Geotools and JTS from gtfs-lib. If you update the PR #103 to do that we should be able to merge and release. We observed a test failing after the change due to different precision in the distance calculations - this is to be expected, and as long as changes are within reasonable tolerances, the test epsilons may be relaxed to make them pass.

abyrd avatar Sep 13 '18 15:09 abyrd

Okay! I probably can't do it today but I will look shortly!

willcohen avatar Sep 13 '18 16:09 willcohen

The above-mentioned RDBMS-backed data tools project is now stable. I think we're ready to start making this transition. Let's update PR #103 to completely remove JTS and GeoTools.

Note that Java 11 LTS is now out. GeoTools 20 and 20.1 are released, which depend on JTS 1.16. So we'll be updating some other projects that depend transitively on gtfs-lib, and it'll be good to eliminate these conflicting versions of GeoTools and JTS in gtfs-lib.

abyrd avatar Dec 13 '18 07:12 abyrd