
GTFS Validator as library

Open damianVarela opened this issue 4 years ago • 8 comments

Are there any plans on making this awesome validator a Java library we can import through Maven?

damianVarela avatar Feb 17 '21 18:02 damianVarela

👋 For us at MobilityData this hasn't come up as a priority. We have captured this idea but haven't planned work on it yet on our side. We would be very happy to see others contribute something.

carlfredl avatar Feb 22 '21 19:02 carlfredl

As we discussed, @barbeau noted that in theory anyone could use the current JAR file as a dependency in their Java project (whether or not it's available on Maven Central) and have access to the Java objects that are currently implemented. Most of the classes should be public and therefore accessible from a dependent project. Without looking further, we don't know how practical this would be, since the validator wasn't specifically designed as a library, but it might work as-is.

carlfredl avatar Feb 22 '21 20:02 carlfredl

There are a couple use cases we could think of. Do you mind sharing yours @damianVarela ?

Integration of objects into Java application use case: A library would allow you to integrate the code/objects directly into your Java application without having to consume the JSON output of the validator. This would be most useful for people who want to extend the validator to implement their own custom rules and notices. For example, the GTFS-realtime validator was split into a library component and an application component to make it easier to use as a library (see https://github.com/CUTR-at-USF/gtfs-realtime-validator#building-the-project).
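To make the "custom rules and notices" idea concrete, here is a minimal sketch of what working with typed objects instead of JSON could look like. Every type below (`Notice`, `StopNameRule`, `BlankStopNameRule`) is hypothetical and invented for illustration; none of them is taken from gtfs-validator, whose actual classes may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types come from gtfs-validator.
public class CustomRuleSketch {

    /** Hypothetical typed finding, standing in for a validator notice. */
    static final class Notice {
        final String code;
        final String message;

        Notice(String code, String message) {
            this.code = code;
            this.message = message;
        }
    }

    /** Hypothetical extension point: a rule inspects stop names and reports notices. */
    interface StopNameRule {
        List<Notice> check(List<String> stopNames);
    }

    /** Example custom rule: report every stop whose name is missing or blank. */
    static final class BlankStopNameRule implements StopNameRule {
        @Override
        public List<Notice> check(List<String> stopNames) {
            List<Notice> notices = new ArrayList<>();
            for (int i = 0; i < stopNames.size(); i++) {
                String name = stopNames.get(i);
                if (name == null || name.trim().isEmpty()) {
                    notices.add(new Notice("blank_stop_name",
                            "stop at index " + i + " has no name"));
                }
            }
            return notices;
        }
    }

    public static void main(String[] args) {
        StopNameRule rule = new BlankStopNameRule();
        // The caller receives typed objects directly; no JSON parsing step is involved.
        for (Notice notice : rule.check(Arrays.asList("Main St", " ", "Oak Ave"))) {
            System.out.println(notice.code + ": " + notice.message);
        }
    }
}
```

The point of the sketch is only that the calling code receives Java objects it can filter or extend, rather than a JSON report it has to parse.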

MobilityData's higher priority in this direction is creating the functionality of "profiles" connected to various additional validation rules, to give data producers transparency on what is being looked at by various consumers beyond the spec itself.

Data processing use case: This applies if you have a Java application that does a lot of data processing and manipulation, and you want to run validation as part of that application. For example, you have an app and you want to integrate the validator output into your existing data processing pipeline. If you can use the gtfs-validator as a library, you can execute it directly from the Java code and get the results in the form of a List.

If you can't use the validator as a library, then your data pipeline needs to halt its existing processing, interact with the command line somehow via scripts to launch the validator, wait for the validator to produce its results (when you don't know how long it's going to take to execute), process the JSON output file once validation is complete, and then resume the pipeline.

Using the validator directly as a library simplifies all this significantly, because you can just execute the code synchronously within your own app, and when validation is complete your pipeline picks back up automatically. An example project that uses the GTFS-rt validator this way is https://github.com/CUTR-at-USF/transit-feed-quality-calculator.
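For comparison, the command-line workaround described above could look roughly like the following sketch, which launches the validator JAR as a child process and blocks until it exits or a timeout expires. The class name `ValidatorProcessRunner`, the 10-minute timeout, and the `-i` / `-o` flags are assumptions for illustration; check the usage text of your validator version for the actual options. Parsing the resulting JSON report is left out.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Sketch of the "launch the CLI and wait" workaround; names and flags are assumptions.
public class ValidatorProcessRunner {

    /**
     * Builds the command line for the validator JAR.
     * ASSUMPTION: "-i" (input feed) and "-o" (output directory) are the CLI flags;
     * verify them against the CLI help of the version you use.
     */
    static List<String> buildCommand(Path jar, Path feed, Path outputDir) {
        return Arrays.asList("java", "-jar", jar.toString(),
                "-i", feed.toString(), "-o", outputDir.toString());
    }

    /**
     * Runs the command and blocks until it exits or the timeout elapses.
     * Returns the process exit code, or -1 if the process was killed on timeout.
     */
    static int runAndWait(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO() // forward validator output so its pipes cannot fill up and stall it
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            return -1;
        }
        return process.exitValue();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: ValidatorProcessRunner <validator.jar> <feed.zip> <outputDir>");
            return;
        }
        List<String> command = buildCommand(
                Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]));
        int exitCode = runAndWait(command, 600); // arbitrary 10-minute limit
        System.out.println("validator exited with code " + exitCode);
        // The pipeline would now read the JSON report from the output directory and resume.
    }
}
```

Even in this small form, the pipeline has to pick a timeout, manage a child process, and locate and parse a report file afterwards, which is the overhead a library call would avoid.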

carlfredl avatar Feb 22 '21 20:02 carlfredl

Note that you can also use tools like JitPack to reference gtfs-validator releases (or even the master branch) and use GitHub as if it were Maven Central, if you don't want to build and bundle the JAR file with your application yourself: https://jitpack.io/#MobilityData/gtfs-validator

barbeau avatar Mar 18 '21 17:03 barbeau

I am implementing a data pipeline able to ingest GTFS zip files. In order to validate them before processing them, it would be great for me to have the gtfs-validator available as a library on any public artifact repository (Maven Central for example). It would ease its adoption as a dependency, even if jitpack can be a temporary solution.

BTW, it would be even better if you could extract only the validation code from gtfs-validator as a public library, without all the CLI support code.

sfalquier avatar Dec 08 '21 10:12 sfalquier

I noticed that the JitPack build for v3.0.0 is broken, so I am falling back on v2.0.0. Is there any plan to fix the JitPack build for v3?

TonyRoussel avatar Jan 27 '22 13:01 TonyRoussel

Thanks @TonyRoussel for reporting this. After exploration with @barbeau, it appears that the update to Gradle 7.2 might be the root of the problem. We'll work on a fix based on these indications, and I will follow up here once we have one.

lionel-nj avatar Jan 31 '22 21:01 lionel-nj

@TonyRoussel https://github.com/MobilityData/gtfs-validator/pull/1099 has been merged and https://github.com/MobilityData/gtfs-validator/issues/1100 has been closed. You can now use jitpack for the latest build of the validator.

lionel-nj avatar Feb 09 '22 13:02 lionel-nj

Release versions of gtfs-validator (currently 4.0.0 and 4.1.0) have been published to Maven Central (see https://github.com/MobilityData/gtfs-validator/pull/1596)

jcpitre avatar Oct 25 '23 15:10 jcpitre