gtfs-editor
gtfs-editor copied to clipboard
Importing DOTC GTFS Feed (Philippines) DB Colomn Cast Issue
Importing the DOTC GTFS Feed causes an error as follows
03:33:34,817 INFO ~ GtfsImporter: importing stops...
03:33:34,837 WARN ~ SQL Error: 0, SQLState: 42804
03:33:34,837 ERROR ~ ERROR: column "agency_id" is of type bigint but expression is of type bytea
Hint: You will need to rewrite or cast the expression.
Position: 206
03:33:34,838 ERROR ~ javax.persistence.PersistenceException: org.hibernate.exception.SQLGrammarException: could not execute native bulk manipulation query
The AgencyId in the stops.txt file is non numeric
stop_id,stop_code,stop_name,stop_desc,stop_lat,stop_lon,zone_id,stop_url,location_type,parent_station,wheelchair_boarding,stop_direction
LTFRB_4944,,Baclaran LRT,,14.534164423026368,120.99826812744139,,,0,,0,
Ah, I think I understand what's happening here.
It's actually related to a limitation in the GTFS spec that we didn't fully address in the importer. The stops.txt file doesn't actually contain an agency_id. Stops are shared across all agencies in a feed.
In the design of our application we assign stops to a specific agency for administrative clarity, otherwise all agencies would be able to edit all stops in the database, and potentially interfere with data they don't have access to.
To get around this limitation when importing data from GTFS the editor currently assumes that a feed contains only a single agency. In cases where multiple agencies are in the feed that it's failing to assign an agency to the stop which is in turn triggering this database error.
There are a couple potential solutions:
- Separate data to be imported into agency specific feeds and load them sequentially. We can beef up the error handling so that feeds with multiple agencies are flagged when the file is loaded and notify the user that it will not be possible to import.
This is how we've dealt with this problem so far, and mirrors how many feeds are delivered (one feed per agency). This is just a mater of partitioning data.
- Attempt to auto-detect agency-stop relationships via stop_times.txt relationships during import. This isn't foolproof as per GTFS spec stops can have more than one agency. In those cases, as currently designed, the editor would either need to duplicate the stop and assign a copy to each agency or stop the import and raise an error.
This would require at minimum a change to the importer and likely the UI to identify how the data was modified during import.
- Change the editor to allow stops that span agency jurisdiction. Not sure how this would work in the context of allowing agencies access only to their own data. Given the lack of clarity around how to implement this it seems like an inappropriate solution.
Hi Kevin, thanks for the clear explanation!
I thought DOTC Philippines was using the gtfs-fees editor to produce their feeds, but perhaps I am wrong in that.
The point about Multiple Agencies sharing stops is a god one. This is obviously a generic Issue. In the real world, a stop is a stop shared by multiple Agencies, Bus Lines, etc and I guess he last thing you would want is the stop to appear 10 times on a map if 10 bus companies used it and published GTFS feeds for their routes.
Is their a mechanism available to just have generic Stops published and shared for all agencies to use?
I can see this being a real issue here in the philippines due to the volume of stops that exist on the public routes and everyone uses them
Regards
Mark Cupitt
"If we change the world, let it bear the mark of our intelligence"
See me on Open Street Map https://www.openstreetmap.org/user/Mark_Cupitt
See me on LinkedIn http://ph.linkedin.com/in/markcupitt
*See me on StackExchange http://gis.stackexchange.com/users/17846/mark-c *
The contents of this email are intended only for the individual(s) to whom it is addressed and may contain confidential or privileged information. If you are not the intended recipient, you must not disclose, copy, distribute, or use the contents of this email. If you have received this email in error, please notify the sender immediately and
delete the email and any attachments.
On Wed, Jul 9, 2014 at 1:41 AM, Kevin Webb [email protected] wrote:
Ah, I think I understand what's happening here.
It's actually related to a limitation in the GTFS spec that we didn't fully address in the importer. The stops.txt file doesn't actually contain an agency_id. Stops are shared across all agencies in a feed.
In the design of our application we assign stops to a specific agency for administrative clarity, otherwise all agencies would be able to edit all stops in the database, and potentially interfere with data they don't have access to.
To get around this limitation when importing data from GTFS the editor currently assumes that a feed contains only a single agency. In cases where multiple agencies are in the feed that it's failing to assign an agency to the stop which is in turn triggering this database error.
There are a couple potential solutions:
- Separate data to be imported into agency specific feeds and load them sequentially. We can beef up the error handling so that feeds with multiple agencies are flagged when the file is loaded and notify the user that it will not be possible to import.
This is how we've dealt with this problem so far, and mirrors how many feeds are delivered (one feed per agency). This is just a mater of partitioning data.
- Attempt to auto-detect agency-stop relationships via stop_times.txt relationships during import. This isn't foolproof as per GTFS spec stops can have more than one agency. In those cases, as currently designed, the editor would either need to duplicate the stop and assign a copy to each agency or stop the import and raise an error.
This would require at minimum a change to the importer and likely the UI to identify how the data was modified during import.
- Change the editor to allow stops that span agency jurisdiction. Not sure how this would work in the context of allowing agencies access only to their own data. Given the lack of clarity around how to implement this it seems like an inappropriate solution.
— Reply to this email directly or view it on GitHub https://github.com/conveyal/gtfs-editor/issues/73#issuecomment-48373809.
Hi,
I also encountered this problem and tried using the 1st solution. I separated the data into agency specific feed (for the stops.txt), and then changed the stop_ID to only numeric data. I tried importing it and I'm getting the following error: 16:47:37,698 WARN ~ Waiting for snapshotMerge to save... 16:47:38,699 INFO ~ reading entities: org.onebusaway.gtfs.model.Agency 16:47:38,700 INFO ~ reading entities: org.onebusaway.gtfs.model.ShapePoint 16:47:38,700 INFO ~ reading entities: org.onebusaway.gtfs.model.Route 16:47:38,700 INFO ~ reading entities: org.onebusaway.gtfs.model.Stop 16:47:38,701 INFO ~ reading entities: org.onebusaway.gtfs.model.Trip 16:47:38,701 INFO ~ reading entities: org.onebusaway.gtfs.model.StopTime 16:47:38,701 INFO ~ reading entities: org.onebusaway.gtfs.model.ServiceCalendar 16:47:38,702 INFO ~ reading entities: org.onebusaway.gtfs.model.ServiceCalendarDate 16:47:38,702 INFO ~ reading entities: org.onebusaway.gtfs.model.FareAttribute 16:47:38,702 INFO ~ reading entities: org.onebusaway.gtfs.model.FareRule 16:47:38,702 INFO ~ reading entities: org.onebusaway.gtfs.model.Frequency 16:47:38,703 INFO ~ reading entities: org.onebusaway.gtfs.model.Pathway 16:47:38,703 INFO ~ reading entities: org.onebusaway.gtfs.model.Transfer 16:47:38,703 INFO ~ GtfsImporter: importing agencies... 16:47:38,706 ERROR ~ java.lang.NullPointerException
What do you guys think is the problem? Thank you!
+1 for this. I think I've discussed with @kpwebb by email, but I've just come across this again today. I guess the real-life behaviour is that the stop usually belongs to somebody, even if that somebody is just a city council department who puts a bench and a stop sign in place and doesn't actually run any buses. Would it therefore be possible to have some kind of functionality where agencies could have read-only access to each other's stops? (I suppose this is closest to number 3 above)
I am sure that would raise problems I haven't forseen with trying to do exports etc., but they might be addressable.
In the latest master, every stop is owned by an agency, and only that agency can stop there. If you import a GTFS, the stops will be duplicated among all agencies that use them.